CL 1.4 Quantitative Methods For Business
The Governing Council of CMA reserves the right to make any amendments it deems necessary during the
period covered herein.
©Copyright reserved.
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by
means, electronic, mechanical, photocopying, recording or otherwise without the prior written permission of
the Institute of Certified Management Accountants of Sri Lanka.
Published by:
Institute of Certified Management Accountants of Sri Lanka
29/24, Visakha Private Road,
Colombo 04, Sri Lanka.
A Word to the Students
This text book is designed as a guide to lead Certificate Level students in the study of CL 1.4
Quantitative Methods for Business (QMB) examination paper. It is carefully prepared to
cover the syllabus content, giving comprehensive explanations in each section including
exposure in answering questions.
Note that examination success will depend not only on your knowledge, but also on your
ability to present what you have learnt, in response to the given questions, within the
specified time period.
You may refer to the ‘Examination Guide’ published for each level and be familiar with the
CMA Examination Policies and Guidelines for Computer Based Examinations, the examination
paper structure, the Hierarchy of Taxonomy (Action Verbs) and the Pilot papers.
Make every effort to understand the subject and develop the skills to apply your knowledge.
Knowledge is the theoretical and practical understanding of a subject.
Application is the ability to use knowledge in a given relevant situation. This is the ability
to select the appropriate principles and/or techniques and apply them to relevant
information from a range of data.
Learning Outcomes
Classification of numbers is very important for identifying the solutions in various kinds of
applications of mathematics in various fields including business and economics. Therefore,
in this chapter we will provide a basic idea about the classification of numbers and some of
the properties of their operations.
The set of Natural Numbers denoted by 𝑁 is the set of counting numbers. The natural
numbers are also called the positive integers.
𝑁 = {1, 2, 3, …}
The set of natural numbers and zero is called the set of Whole Numbers.
𝑊 = {0,1, 2, 3, …}
CHAPTER 1 Page 6
The natural numbers together with 0 and the negative values -1, -2, -3, … form the set of integers
denoted by 𝑍.
The set of rational numbers consists of numbers such as 1/3 and 7/5, which can be written as
a ratio (quotient) of two integers. That is, a rational number is one that can be written as 𝑝⁄𝑞,
where 𝑝 and 𝑞 are integers and 𝑞 ≠ 0. Thus, the set of rational numbers can be represented
as,
$Q = \{x \mid x = \frac{p}{q},\ p, q \in Z \text{ and } q \neq 0\}$
All rational numbers can be represented by terminating decimals such as 3/2 = 1.5 or by
non-terminating repeating decimals such as 4/11 = 0.363636… Numbers represented by
non-terminating and non-repeating decimals are called the irrational numbers. An
irrational number cannot be written as an integer divided by an integer.
Examples
The union of the set of rational numbers and irrational numbers will form the set of real
numbers denoted by 𝑅.
Algebra is a branch of mathematics that deals with mathematical symbols and the
manipulation of these symbols to represent relationships and solve equations. Here are
some basics of algebra:
• Linear Equation: A linear equation is an algebraic equation in which the highest power of
the variable is 1. The general form of a linear equation is $ax + b = 0$, where 𝑎 and 𝑏 are
constants and 𝑎 ≠ 0.
• Quadratic Equation: A quadratic equation is an equation in which the highest power of the
variable is 2. The general form of a quadratic equation is $ax^2 + bx + c = 0$, where 𝑎, 𝑏 and
𝑐 are constants and 𝑎 ≠ 0.
1.3 Indices
In mathematics, indices (also known as exponents or powers) are used to represent repeated
multiplication of a number or variable by itself. An index is a small number written to
the upper right of a base number. It indicates the number of times the base number is
multiplied by itself.
For example, in the expression $2^3$, the base number is 2, and the index (or exponent) is 3.
This means that 2 is multiplied by itself three times: 2 × 2 × 2 = 8.
• Power Rule: When raising a number with an index to another index, multiply the indices.
For example, $(a^3)^2 = a^{3 \times 2} = a^6$.
• Zero Rule: Any non-zero number or variable raised to the power of zero equals 1. For
example, $a^0 = 1$.
• Negative Rule: A negative index is equivalent to the reciprocal of the number or variable
raised to the positive index. For example, $a^{-2} = \frac{1}{a^2}$.
Illustration 1.1
• $2^3 \times 2^4 = 2^{3+4} = 2^7 = 128$
• $\frac{2^3}{2^5} = \frac{1}{2^{5-3}} = \frac{1}{2^2} = \frac{1}{4}$
• $3^{\frac{1}{2}} = \sqrt{3}$
The rules of exponents have many uses in the simplification of algebraic expressions.
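The index rules can be checked numerically. The following is a minimal sketch, assuming the base 𝑎 = 2 used in Illustration 1.1; the variable names are illustrative only, and exact fractions are used so that negative indices do not introduce rounding.

```python
from fractions import Fraction

a = Fraction(2)  # assumed base; any non-zero value behaves the same way

# Product of powers with the same base: add the indices
product_rule = a**3 * a**4 == a**(3 + 4)
# Quotient of powers with the same base: subtract the indices
quotient_rule = a**3 / a**5 == a**(3 - 5)
# Power rule: (a^m)^n = a^(m*n)
power_rule = (a**3) ** 2 == a**(3 * 2)
# Zero rule and negative rule
zero_rule = a**0 == 1
negative_rule = a**-2 == 1 / a**2

print(product_rule, quotient_rule, power_rule, zero_rule, negative_rule)
print(a**3 * a**4, a**3 / a**5)
```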
Illustration 1.2
1) 3𝑥 − 9 = 12, find 𝑥.
3𝑥 = 12 + 9
3𝑥 = 21
𝑥 = 7
2) 2𝑥 + 3 = 5𝑥 − 4, find 𝑥.
2𝑥 − 5𝑥 = −4 − 3
−3𝑥 = −7
𝑥 = 7⁄3
When two or more equations are satisfied by the same values of the unknowns, they are
called simultaneous equations. We can solve two simultaneous linear equations by the
method of elimination or the method of substitution.
Illustration 1.4
3𝑥 + 7𝑦 = 27 → (1)
5𝑥 + 2𝑦 = 16 → (2)
29𝑦 = 87
∴𝑦=3
2𝑥 − 5𝑦 = 1 → (1)
7𝑥 + 3𝑦 = 24 → (2)
From (1),
$x = \frac{1 + 5y}{2}$
Now we substitute this value of 𝑥 from (1) into (2). Then we have
$\frac{7(1 + 5y)}{2} + 3y = 24$
∴ 7 + 35𝑦 + 6𝑦 = 48
∴ 41𝑦 = 48 − 7 = 41
𝑦 = 1
$x = \frac{1 + 5(1)}{2} = \frac{6}{2}$ ∴ 𝑥 = 3
Therefore, the complete solution is 𝑥 = 3, 𝑦 = 1.
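Eliminating one unknown from a pair of linear equations always leads to the same closed form (often called Cramer's rule). The sketch below assumes equations written as a1·x + b1·y = c1 and a2·x + b2·y = c2; the function name `solve_2x2` is hypothetical and is used only to check the two systems in Illustration 1.4.

```python
from fractions import Fraction

def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 for (x, y)."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        # Parallel or identical lines: no unique solution exists
        raise ValueError("the equations have no unique solution")
    x = Fraction(c1 * b2 - c2 * b1, det)  # result of eliminating y
    y = Fraction(a1 * c2 - a2 * c1, det)  # result of eliminating x
    return x, y

print(solve_2x2(3, 7, 27, 5, 2, 16))   # first system of Illustration 1.4
print(solve_2x2(2, -5, 1, 7, 3, 24))   # second system of Illustration 1.4
```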
1.4.2 Linear Inequality
Illustration 1.5
1) 2𝑥 + 3 < 5
2𝑥 < 5 − 3
𝑥 < 1
2) 5 ≥ 7𝑥 − 9
−7𝑥 ≥ −9 − 5
𝑥 ≤ (−14)⁄(−7)
𝑥 ≤ 2
A quadratic equation is an equation of the form $ax^2 + bx + c = 0$, where 𝑎, 𝑏 and 𝑐 are
real numbers and 𝑎 ≠ 0. Quadratic equations can often be solved by factoring the quadratic
polynomial and setting each factor equal to zero.
Illustration 1.6
1) $x^2 - 4 = 0$
(𝑥 − 2)(𝑥 + 2) = 0
Hence, either 𝑥 − 2 = 0 or 𝑥 + 2 = 0
∴ 𝑥 = 2 or 𝑥 = −2
2) $x^2 + 2x - 3 = 0$
$x^2 + 3x - x - 3 = 0$
𝑥(𝑥 + 3) − 1(𝑥 + 3) = 0
(𝑥 + 3)(𝑥 − 1) = 0
Hence, either 𝑥 + 3 = 0 or 𝑥 − 1 = 0
∴ 𝑥 = −3 or 𝑥 = 1
3) $x^2 - 6x + 9 = 0$
$x^2 - 3x - 3x + 9 = 0$
𝑥(𝑥 − 3) − 3(𝑥 − 3) = 0
(𝑥 − 3)(𝑥 − 3) = 0
∴ 𝑥 − 3 = 0
𝑥 = 3
4) $x^2 + 7 = 0$
$x^2 = -7$
The equation does not have real number solutions.
Not all quadratic equations have real number solutions, and even those that do are often
difficult to solve by factoring. It is possible to tell whether or not a quadratic equation has
real number solutions, and to find them when they exist, by means of the quadratic
formula.
$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$
• If $b^2 - 4ac > 0$, the equation has two distinct real number solutions.
• If $b^2 - 4ac = 0$, the equation has exactly one real number solution.
• If $b^2 - 4ac < 0$, the equation has no real number solutions.
Illustration 1.7
1) $x^2 - 5x + 6 = 0$
We see that 𝑎 = 1, 𝑏 = −5, and 𝑐 = 6.
Hence, $x = \frac{-(-5) \pm \sqrt{(-5)^2 - 4(1)(6)}}{2(1)} = \frac{5 \pm \sqrt{25 - 24}}{2} = \frac{5 \pm 1}{2}$
∴ 𝑥 = 3 or 𝑥 = 2
2) $x^2 + 8x + 16 = 0$
We see that 𝑎 = 1, 𝑏 = 8, and 𝑐 = 16.
Hence, $x = \frac{-(8) \pm \sqrt{(8)^2 - 4(1)(16)}}{2(1)} = \frac{-8 \pm \sqrt{64 - 64}}{2} = \frac{-8 \pm 0}{2}$
∴ 𝑥 = −4
3) $2x^2 + 5x - 1 = 0$
We see that 𝑎 = 2, 𝑏 = 5, and 𝑐 = −1.
Hence, $x = \frac{-(5) \pm \sqrt{(5)^2 - 4(2)(-1)}}{2(2)} = \frac{-5 \pm \sqrt{25 + 8}}{4} = \frac{-5 \pm \sqrt{33}}{4} = \frac{-5 \pm 5.745}{4}$
∴ 𝑥 = 0.186 or 𝑥 = −2.686
4) $x^2 + 4x + 5 = 0$
We see that 𝑎 = 1, 𝑏 = 4, and 𝑐 = 5.
Here, $b^2 - 4ac = 4^2 - 4(1)(5) = -4 < 0$
Therefore, this equation does not have real number solutions.
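The discriminant check and the quadratic formula can be combined in a few lines. This is a sketch only, assuming real coefficients with 𝑎 ≠ 0; the function name `solve_quadratic` is hypothetical and returns an empty list when there are no real solutions.

```python
import math

def solve_quadratic(a, b, c):
    """Return the real solutions of a*x**2 + b*x + c = 0 in ascending order."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic equation")
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        return []                      # no real number solutions
    if discriminant == 0:
        return [-b / (2 * a)]          # one repeated solution
    root = math.sqrt(discriminant)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])

print(solve_quadratic(1, -5, 6))   # equation 1 of Illustration 1.7
print(solve_quadratic(1, 8, 16))   # equation 2
print(solve_quadratic(2, 5, -1))   # equation 3
print(solve_quadratic(1, 4, 5))    # equation 4
```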
1.6 Graphing Functions
A function having defining equation of the form 𝑦 = 𝑚𝑥 + 𝑏, where 𝑚 and 𝑏 are real numbers,
is called a linear function. The graph of a linear function is a straight line. Since two points
determine a straight line, we need only to find two points that lie on the line.
If the graph of a function intersects the 𝑦 - axis at the point (0, 𝑏), we say that 𝑏 is the 𝒚 –
intercept, if the graph intersects the 𝑥 - axis at the point (𝑎, 0), we say that a is the 𝒙 -
intercept. If the equation of a straight line is 𝑦 = 𝑚𝑥 + 𝑏, we say that 𝑚 is the slope of the
line.
Note: Two lines are parallel if they have the same slope.
Illustration 1.8
If 𝑥 = 0, 𝑦 = 4 then 𝑦 - intercept = 4.
If 𝑦 = 0, 𝑥 = 3 then 𝑥 - intercept = 3.
A quadratic function has the equation of the form $y = ax^2 + bx + c$ where 𝑎, 𝑏 and 𝑐 are
real numbers and the graph of a quadratic function is a parabola. The parabola has a single
vertex, that is, a lowest or highest point of the parabola. Each parabola is symmetric with
respect to a vertical line through its vertex.
The graph of the quadratic function 𝑦 = 𝑎𝑥2 + 𝑏𝑥 + 𝑐,
Illustration 1.9
1) $y = -x^2 + 4x + 12$
$\frac{-b}{2a} = \frac{-4}{2(-1)} = 2$ and $c - \frac{b^2}{4a} = 12 - \frac{4^2}{4(-1)} = 16$
Hence the parabola has its vertex at (2, 16). The 𝑦-intercept is (0, 12).
To find the 𝑥-intercepts we solve $-x^2 + 4x + 12 = 0$.
Thus, we have (𝑥 − 6)(𝑥 + 2) = 0;
∴ 𝑥 = 6 or 𝑥 = −2.
Hence, the 𝑥-intercepts are (6, 0) and (−2, 0).
2) $y = 4x^2 - 20x + 25$
Here, 𝑎 = 4, 𝑏 = −20, and 𝑐 = 25. The parabola is concave up since 𝑎 = 4 > 0, has its
vertex at $\left(\frac{-b}{2a},\ c - \frac{b^2}{4a}\right) = \left(\frac{5}{2},\ 0\right)$ and has its 𝑦-intercept at (0, 25). Using the quadratic formula to solve
the equation $4x^2 - 20x + 25 = 0$, we have,
$x = \frac{20 \pm \sqrt{400 - 400}}{8} = \frac{5}{2}$
3) $y = -3x^2 + 6x - 4$
The parabola is concave down, has its vertex at (1, −1) and has its 𝑦-intercept at (0, −4).
Using the quadratic formula to solve the equation $-3x^2 + 6x - 4 = 0$, we obtain,
$x = \frac{-6 \pm \sqrt{36 - 48}}{-6} = \frac{-6 \pm \sqrt{-12}}{-6}$
Thus, the equation $-3x^2 + 6x - 4 = 0$ does not have real number solutions because
the discriminant $b^2 - 4ac = -12 < 0$. Therefore, the parabola does not have 𝑥-intercepts.
Choosing 𝑥 = 2 arbitrarily we find 𝑦 = −4. Hence the point (2, −4) lies on
the parabola.
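The vertex and 𝑦-intercept used in Illustration 1.9 follow directly from the coefficients. A small sketch, assuming 𝑎 ≠ 0; the helper names `vertex` and `y_intercept` are hypothetical.

```python
def vertex(a, b, c):
    """Vertex of y = a*x**2 + b*x + c, i.e. (-b/(2a), c - b**2/(4a))."""
    return (-b / (2 * a), c - b * b / (4 * a))

def y_intercept(a, b, c):
    """Point where the parabola crosses the y-axis (x = 0)."""
    return (0, c)

for coeffs in [(-1, 4, 12), (4, -20, 25), (-3, 6, -4)]:
    shape = "concave up" if coeffs[0] > 0 else "concave down"
    print(coeffs, vertex(*coeffs), y_intercept(*coeffs), shape)
```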
1.7 Exponential and Logarithmic Functions
Let 𝑏 be any positive real number except 1. A function whose defining equation is of
the form $y = cb^{rx}$, where 𝑐 and 𝑟 are real numbers with 𝑐 ≠ 0 and 𝑟 ≠ 0, is called an
exponential function with base 𝑏.
Illustration 1.10
1) $y = 2^x$
𝑥 -3 -2 -1 0 1 2 3
𝑦 1/8 1/4 1/2 1 2 4 8
2) $y = 2^{-x}$
𝑥 -2 -1 0 1 2
𝑦 4 2 1 1/2 1/4
Suppose a company's sales were Rs. 100,000 in 2000 and have been doubling every 5 years.
This is a situation of exponential growth. If the sales had instead been halving every 5
years, the situation would be an exponential decline. Exponential growth functions are of the form
$y = ce^{rx}$ and exponential decline functions are of the form $y = ce^{-rx}$, where 𝑐 > 0 and 𝑟 > 0.
Illustration 1.11
If inflation persists, the purchasing power of the rupee declines exponentially. Let y
denote the purchasing power of Rs. 1, in today's terms, 𝑡 years from now, and suppose
that inflation persists at an annual rate of 15%.
Let $y = ce^{-rt}$ be the exponential decline function. Since 𝑦 = Rs. 1 when 𝑡 = 0 years,
we have $1 = ce^0$ ∴ 𝑐 = 1.
Thus, $y = e^{-rt}$
When t = 1 year, the rupee will be worth 15% less than it is now, that is Rs. 0.85.
∴ $0.85 = e^{-r}$. Solving for 𝑟 by taking natural logarithms of both sides, we have
$r = -\ln(0.85) \approx 0.16$
∴ $y = e^{-0.16t}$.
(ii) Find the purchasing power of the rupee after 9.3 years.
When t = 9.3, we have $y = e^{-0.16(9.3)} = 0.23$.
∴ Purchasing power declines by 77% in 9.3 years.
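The calculation in Illustration 1.11 can be reproduced as follows. This is a sketch under the illustration's own assumptions (15% annual inflation, 𝑐 = 1); the function name `purchasing_power` is hypothetical. Note that the text rounds 𝑟 to 0.16 before substituting, so the unrounded rate gives a slightly smaller value.

```python
import math

# Decline rate from 0.85 = e^(-r)
r_exact = -math.log(0.85)
r_rounded = round(r_exact, 2)   # 0.16, as used in the illustration

def purchasing_power(t, rate):
    """Value of Rs. 1 after t years under y = e^(-rate * t)."""
    return math.exp(-rate * t)

print(r_exact)
print(purchasing_power(9.3, r_rounded))  # follows the text's rounded rate
print(purchasing_power(9.3, r_exact))    # unrounded rate, for comparison
```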
1.7.2 Logarithms
Let 𝑏 be a real number with 𝑏 > 1, and suppose that $x = b^y$. The real number 𝑦 is called the
logarithm to the base 𝑏 of 𝑥 and is denoted by $y = \log_b x$, with 𝑥 > 0.
1) $\log_2 8 = 3$ because $2^3 = 8$
2) $\log_3 \frac{1}{27} = -3$ because $3^{-3} = \frac{1}{27}$
3) $\log_{10} 10 = 1$ because $10^1 = 10$
4) $\log_e 1 = 0$ because $e^0 = 1$
a) Rules of Logarithms
Let 𝑏 be real number such that 𝑏 > 1, and 𝐴 and 𝐵 denote positive numbers. Then
Note:
The logarithm to the base 10 is called the common logarithm and is written 𝑙𝑜𝑔. The
logarithm to the base 𝑒 is called the natural logarithm and is written 𝑙𝑛. Common and
natural logarithms can be evaluated using a calculator or a table.
For example, log 2 = 0.3010 and ln 2 = 0.6931
Illustration 1.12
Given that log 2 = 0.3010, log 3 = 0.4771, and log 5 = 0.6990 use the rules of logarithm
to find the following.
4) $\log\left(\frac{2}{9}\right) = \log 2 - \log 9 = \log 2 - 2\log 3 = 0.3010 - 2(0.4771) = -0.6532$
The logarithm of a number to any base other than 10 or 𝑒 can be calculated using the formula,
$\log_b x = \frac{\log_a x}{\log_a b}$ where 𝑎 > 1, 𝑏 > 1 and 𝑥 > 0
in particular,
$\log_b x = \frac{\log x}{\log b} = \frac{\ln x}{\ln b}$
Example
$\log_3 5 = \frac{\log 5}{\log 3} = \frac{\ln 5}{\ln 3} = 1.46497$
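The change of base formula is easy to verify with a calculator or a short program. A minimal sketch, assuming 𝑥 > 0 and 𝑏 > 1; the function name `log_base` is hypothetical.

```python
import math

def log_base(x, b):
    """Logarithm of x to base b, computed with natural logarithms."""
    return math.log(x) / math.log(b)

via_ln = log_base(5, 3)
via_log10 = math.log10(5) / math.log10(3)  # same value using common logarithms

print(round(via_ln, 5), round(via_log10, 5))
print(log_base(8, 2))
```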
$y = \log_2 x \Rightarrow x = 2^y$
𝑦 -3 -2 -1 0 1 2 3
𝑥 1/8 1/4 1/2 1 2 4 8
Arithmetic progression (AP) is a sequence of numbers in which the difference between any
two consecutive terms is constant. This constant value is known as the common difference
and denoted by “𝑑”. In an AP, the common difference ensures a consistent increment or
decrement between consecutive terms, leading to a predictable pattern.
From these results, it can be seen that the 𝑛𝑡ℎ term is,
𝑇𝑛 = 𝑎 + (𝑛 − 1)𝑑
Examples:
Illustration 1.13
Find the 18th term of the arithmetic progression whose first term is 15 and the common
difference is 5.
Answer.
a = 15, d = 5, n = 18, T18 = ?
T18 = 15 + (18 − 1)(5) = 15 + 85
T18 = 100.
Illustration 1.14
Find the first term of an AP if the 15th term is 35 and the common difference is −3.
Answer:
Tn = a + (n-1) d
35 = a + (15-1) (-3)
35 = a - 42
a = 77
Illustration 1.15
Tn = a + (n-1) d
22 = -81 + (n - 1) (3)
22 = -81 + 3n - 3
n = 32
Illustration 1.16
The first term of an arithmetic progression is 52 and its 30th term is −35. Find the
common difference of the progression.
Tn = a + (n - 1) d
- 35 = 52 + (30-1) d
- 35 - 52 = 29 d
d = -3
Illustration 1.17
Answer.
𝑇𝑛 = 3𝑛 − 2
Substituting, n = 1 T1 = 3 (1) - 2 = 1
Substituting, n = 2 T2 = 3 (2) -2 = 4
Substituting, n = 3 T3 = 3 (3) - 2 = 7
T2 - T1 = 4 -1 = 3
T3 - T2 = 7 - 4 = 3
Hence, the above series is an arithmetic progression whose first term is 1 and the value
of common difference is 3.
Illustration 1.18
The sum of 4th and 10th terms of an arithmetic progression is 20 and the difference of
20th and 10th terms is 50. Find the first term, common difference, and the value of the
40th term.
Answer.
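T4 + T10 = 20
(a + 3d) + (a + 9d) = 20
2a + 12d = 20
a + 6d = 10 → (1)
T20 − T10 = 50
(a + 19d) − (a + 9d) = 50
10d = 50, so d = 5
Substituting d = 5 in (1): a + 30 = 10, so a = −20
T40 = a + 39d = −20 + 39(5) = 175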
Illustration 1.19
The terms −8, $x^2$, and $17x$ are the first three terms of an arithmetic progression. Find the
value of 𝑥.
Since the difference between consecutive terms is constant,
$x^2 - (-8) = 17x - x^2$
$x^2 + 8 = 17x - x^2$
$2x^2 - 17x + 8 = 0$
$(2x - 1)(x - 8) = 0$
$x = \frac{1}{2}$ or $x = 8$
1.8.1 Sum of the First n Terms of an Arithmetic Progression
Let 𝑆𝑛 denote the sum of the first 𝑛 terms of an arithmetic progression with first term 𝑎 and
last term 𝑙. It can be shown that,
$S_n = \frac{n}{2}(a + l)$.
This formula can be used to compute the sum of an AP if the first term and the last term
are known.
If the last term of the AP is unknown but the first term and the common difference are known,
the formula below can be used to compute the sum of the first 𝑛 terms.
$S_n = \frac{n}{2}\{2a + (n - 1)d\}$
Illustration 1.20
An arithmetic progression has first term 4, last term 124 and 13 terms.
Find the sum of this series.
Answer
a = 4, l = 124, n = 13
Substituting in $S_n = \frac{n}{2}(a + l)$,
$S_{13} = \frac{13}{2}(4 + 124)$
$S_{13} = 832$.
Illustration 1.21
$S_{25} = \frac{25}{2}\{2(-1) + (25 - 1)(10)\}$
$S_{25} = 2{,}975$
Illustration 1.22
In the arithmetic progression 48, 45, 42, ..., what is the minimum number of terms
that need to be taken to achieve a sum of 405?
Answer.
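a = 48, d = −3, Sn = 405, n = ?
$S_n = \frac{n}{2}\{2a + (n - 1)d\}$
$405 = \frac{n}{2}\{2(48) + (n - 1)(-3)\}$
$810 = n(99 - 3n)$
$3n^2 - 99n + 810 = 0$
$n^2 - 33n + 270 = 0$
$(n - 18)(n - 15) = 0$
𝑛 = 18 or 𝑛 = 15
Both values give a sum of 405, so the minimum number of terms is 15.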
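The formulas for the 𝑛th term and the sum of an arithmetic progression translate directly into code. The sketch below assumes integer values of 𝑎 and 𝑑; the function names are hypothetical, and the search in `min_terms_for_sum` simply tries 𝑛 = 1, 2, 3, … instead of solving the quadratic.

```python
def ap_term(a, d, n):
    """n-th term of an AP: T_n = a + (n - 1)d."""
    return a + (n - 1) * d

def ap_sum(a, d, n):
    """Sum of the first n terms: S_n = n/2 * {2a + (n - 1)d}."""
    # n * (2a + (n - 1)d) is always even for integer a and d
    return n * (2 * a + (n - 1) * d) // 2

def min_terms_for_sum(a, d, target, max_n=1000):
    """Smallest n with S_n == target, or None if none is found up to max_n."""
    for n in range(1, max_n + 1):
        if ap_sum(a, d, n) == target:
            return n
    return None

print(ap_term(15, 5, 18))              # Illustration 1.13
print(ap_sum(4, 10, 13))               # Illustration 1.20 (d = 10 follows from l = 124)
print(ap_sum(-1, 10, 25))              # Illustration 1.21
print(min_terms_for_sum(48, -3, 405))  # Illustration 1.22
```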
$T_n = ar^{n-1}$
Examples:
3, 6, 12, 24, … 𝑎 = 3, 𝑟 = 2
Answer:
$a = 2$, $r = \frac{6}{2} = 3$, $n = 7 \Rightarrow T_7 = ?$
Substituting in $T_n = ar^{n-1}$,
$T_7 = ar^6 = 2(3)^6 = 1{,}458$
Illustration 1.24
The first term of a geometric progression is 40. Fourth term is −5. Find the 11th term
and nth term.
Answer:
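$T_4 = -5$
$ar^3 = -5$
$40r^3 = -5 \Rightarrow r^3 = -\frac{1}{8} \Rightarrow r = -\frac{1}{2}$
$T_{11} = ar^{10} = 40\left(-\frac{1}{2}\right)^{10} = 40 \times \frac{1}{1024} = \frac{5}{128}$
$T_n = ar^{n-1} \Rightarrow T_n = 40 \times \left(-\frac{1}{2}\right)^{n-1}$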
Illustration 1.25
The second term of a geometric progression which has a positive common ratio is 4 and
4th term is 8. In this series obtain
(a) Let the first term be 𝑎 and the common ratio be 𝑟 for the geometric progression
that satisfies the given conditions.
$T_2 = 4 \Rightarrow ar = 4$ → (1)
$T_4 = 8 \Rightarrow ar^3 = 8$ → (2)
(2) ÷ (1): $\frac{ar^3}{ar} = \frac{8}{4} \Rightarrow r^2 = 2$
$r = \pm\sqrt{2}$
Since the common ratio is positive, $r = \sqrt{2}$.
Illustration 1.26
3, 𝑥 and (𝑥 + 6) are the first three terms of a geometric progression in which all
terms are positive.
Answer:
(a) 3, 𝑥, (𝑥 + 6), …
Since this is a geometric progression, the common ratios $\frac{x}{3}$ and $\frac{x + 6}{x}$ should be the same,
$\therefore \frac{x}{3} = \frac{x + 6}{x}$
$x^2 = 3(x + 6)$
$x^2 - 3x - 18 = 0$
$(x - 6)(x + 3) = 0$
𝑥 = 6 or 𝑥 = −3
Since all terms are positive, 𝑥 = 6.
Illustration 1.27
Dulanjith puts a sum of A into an investment earning interest at 4% per annum. The value
of the investment after 5 years will be Rs. 100,000. Compute the value of the investment after
10 years, to the nearest rupee.
Answer:
$A(1.04)^5 = 100000$
$A = \frac{100000}{(1.04)^5}$
Value after 10 years $= A(1.04)^{10} = 100000 \times (1.04)^5$
= Rs. 121,665.29, that is, approximately Rs. 121,665.
The number of fish in one area of the Indian Ocean is reducing by 6% per annum owing to
heavy fishing. How long will it take for the number of fish living in that area
to reduce to exactly half of the number existing currently?
Answer:
Suppose the existing number of fish is 𝑥. Then exactly half equals $\frac{x}{2}$. Since the number
reduces by 6% per annum, the numbers of fish in successive years form a geometric progression
whose common ratio is 0.94.
100 → 94
1 → $\frac{94}{100} = 0.94$
If the time taken to reach one half is 𝑛 years, the number of fish at the end of that period is $x(0.94)^n$
$\therefore \frac{x}{2} = x(0.94)^n$
$\frac{1}{2} = (0.94)^n$
Taking logarithms of both sides,
$n \log(0.94) = \log(0.5)$
$n = \frac{\log(0.5)}{\log(0.94)} = 11.2$ years
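The halving time can be checked numerically. A minimal sketch, assuming a constant 6% annual reduction as in the illustration; the variable names are illustrative.

```python
import math

decay_ratio = 0.94  # each year 94% of the previous year's fish remain

# Solve 0.5 = decay_ratio ** n for n by taking logarithms of both sides
years_to_halve = math.log(0.5) / math.log(decay_ratio)

# First whole year by which the population has fallen below one half
whole_years = math.ceil(years_to_halve)

print(round(years_to_halve, 1), whole_years)
```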
Illustration 1.29
The sum of the first and second terms of a geometric progression is 9. The sum of the
first, second and third terms is 21. Find the first term and the common ratio of this
series and write the first three terms.
Answer:
𝑇1 + 𝑇2 = 9 and 𝑇1 + 𝑇2 + 𝑇3 = 21
𝑎 + 𝑎𝑟 = 9 ⇒ 𝑎(1 + 𝑟) = 9 (1)
𝑎 + 𝑎𝑟 + 𝑎𝑟 2 = 21 ⇒ 𝑎(1 + 𝑟 + 𝑟 2 ) = 21 (2)
(2) ÷ (1) ⇒ $\frac{a(1 + r + r^2)}{a(1 + r)} = \frac{21}{9}$
Since 𝑎 ≠ 0,
$\frac{1 + r + r^2}{1 + r} = \frac{21}{9}$
$9(1 + r + r^2) = 21(1 + r)$
$9 + 9r + 9r^2 = 21 + 21r$
$9r^2 - 12r - 12 = 0$
$3r^2 - 4r - 4 = 0$
$(3r + 2)(r - 2) = 0$
$r = -\frac{2}{3}$ or $r = 2$
When $r = -\frac{2}{3}$, by (1), $a\left(1 - \frac{2}{3}\right) = 9$
$a \times \frac{1}{3} = 9 \Rightarrow a = 27$
When $r = 2$, by (1), $a(1 + 2) = 9 \Rightarrow a = 3$
When 𝑎 = 3 and 𝑟 = 2, the first three terms of the geometric progression are 3, 6, 12, …
When 𝑎 = 27 and $r = -\frac{2}{3}$, the first three terms of the geometric progression are 27, −18, 12, …
1.9.1 Sum of the First 𝒏 Terms of a Geometric Progression
$S_n = \begin{cases} \frac{a(r^n - 1)}{r - 1} = \frac{a(1 - r^n)}{1 - r}, & \text{when } r \neq 1 \\ na, & \text{when } r = 1 \end{cases}$
Note: Calculation will be easy with the formula $S_n = \frac{a(r^n - 1)}{r - 1}$ when 𝑟 > 1 and the formula
$S_n = \frac{a(1 - r^n)}{1 - r}$ when 𝑟 < 1.
Illustration 1.30
Answer:
1,2,4,8, ….
$a = 1$, $r = \frac{2}{1} = 2$, $n = 10$
Since 𝑟 > 1, let us use the formula $S_n = \frac{a(r^n - 1)}{r - 1}$
$\therefore S_{10} = \frac{1(2^{10} - 1)}{2 - 1} = 1023$
Illustration 1.31
Find the sum of the first 8 terms of a geometric progression of 4, -12, 36, -108, ...
Answer:
$a = 4$, $r = \frac{-12}{4} = -3$, $n = 8$
$\therefore S_8 = \frac{4\{1 - (-3)^8\}}{1 - (-3)} = 1 - 6561 = -6560$
Illustration 1.32
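The first term of a geometric progression is 16 and the common ratio is $\frac{3}{2}$. How many
terms should be taken from this progression if the sum of these terms is 211?
Answer:
$a = 16$, $r = \frac{3}{2}$, $S_n = 211$, $n = ?$
Since $r = \frac{3}{2} > 1$, let us use $S_n = \frac{a(r^n - 1)}{r - 1}$
$211 = \frac{16\left\{\left(\frac{3}{2}\right)^n - 1\right\}}{\frac{3}{2} - 1}$
$211 = 32\left\{\left(\frac{3}{2}\right)^n - 1\right\}$
$\frac{211}{32} = \left(\frac{3}{2}\right)^n - 1$
$\frac{211}{32} + 1 = \left(\frac{3}{2}\right)^n$
$\left(\frac{3}{2}\right)^n = \frac{243}{32} = \frac{3^5}{2^5}$
$\left(\frac{3}{2}\right)^n = \left(\frac{3}{2}\right)^5$
𝑛 = 5.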
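The 𝑛th term and sum formulas for a geometric progression can be checked against the illustrations above. This is a sketch only; the function names are hypothetical, and exact fractions are used so that ratios such as 3/2 and −1/2 are not rounded.

```python
from fractions import Fraction

def gp_term(a, r, n):
    """n-th term of a GP: T_n = a * r**(n - 1)."""
    return a * Fraction(r) ** (n - 1)

def gp_sum(a, r, n):
    """Sum of the first n terms of a GP with first term a and ratio r."""
    r = Fraction(r)
    if r == 1:
        return a * n
    return a * (r**n - 1) / (r - 1)

print(gp_term(2, 3, 7))                  # a = 2, r = 3, 7th term
print(gp_term(40, Fraction(-1, 2), 11))  # Illustration 1.24
print(gp_sum(1, 2, 10))                  # Illustration 1.30
print(gp_sum(4, -3, 8))                  # Illustration 1.31
print(gp_sum(16, Fraction(3, 2), 5))     # Illustration 1.32
```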
The concept of the derivative is the basis of calculus. Differentiation is a method used to find
the slope of a function at any point. Although this is a useful tool in itself, it also forms the
basis for some very powerful techniques for solving optimization problems. The process of
finding the derivative is called differentiation. To denote the derivative of 𝑦 = 𝑓(𝑥) with
respect to 𝑥, any of the following notations can be used.
$f'(x)$, $\frac{dy}{dx}$, $\frac{d}{dx}f(x)$, and $y'$
In this section, we shall look at some of the most important rules of differentiation. These
rules are involved in completely mechanical and efficient procedures for differentiation.
If $f(x) = g(x) \pm h(x)$ and if $g'(x)$ and $h'(x)$ exist, then $f'(x) = g'(x) \pm h'(x)$.
Rule 4 can be extended to the derivative of any finite number of sums and differences of
functions.
If ℎ(𝑥) = 𝑓(𝑥)𝑔(𝑥) and 𝑓 ′ (𝑥) and 𝑔′ (𝑥) exist, then ℎ′ (𝑥) = 𝑓(𝑥)𝑔′ (𝑥) + 𝑔(𝑥)𝑓′(𝑥).
That is, the derivative of the product of two functions is the first function times the derivative
of the second, plus the second function times the derivative of the first.
Let $h(x) = \frac{f(x)}{g(x)}$ such that 𝑔(𝑥) ≠ 0. If $f'(x)$ and $g'(x)$ exist, then
$h'(x) = \frac{g(x)f'(x) - f(x)g'(x)}{\{g(x)\}^2}$
Eg: $f(x) = \frac{2x + 1}{x - 1}$
To analyse the behaviour of a function, we must be able to draw the graph of that function
accurately enough to show where the function is increasing and where it is decreasing. In this
section, we discuss the use of the first derivative of a function to determine this.
If 𝑓′(𝑥) > 0 for all 𝑥 in an interval (𝑐, 𝑑), then 𝑓(𝑥) is increasing over the interval.
If 𝑓 ′ (𝑥) < 0 for all 𝑥 in an interval (𝑐, 𝑑), then 𝑓(𝑥) is decreasing over the interval.
Let 𝑓(𝑥) be a function. The point (𝑎, 𝑓(𝑎)) is a relative maximum point for 𝑓(𝑥) if 𝑓(𝑥) ≤ 𝑓(𝑎)
for all values of 𝑥 near 𝑎 on either side of 𝑎. The point (𝑏, 𝑓(𝑏)) is a relative minimum point for
𝑓(𝑥) if 𝑓(𝑥) ≥ 𝑓(𝑏) for all values of 𝑥 near 𝑏 on either side of 𝑏. The function whose graph is
shown in the following figure has relative maxima at the points (4, 8) and (9, 7) and relative
minima at the points (3, 2) and (7, 4). The point (5, 6) is known as a point of inflexion.
The relative maximum and relative minimum points of a function together are also called
relative extrema or turning points. If the function 𝑓(𝑥) has a relative extremum at 𝑥 = 𝑎, then
$f'(a) = 0$ or $f'(x)$ does not exist at 𝑥 = 𝑎. All the values of 𝑥 for which $f'(x) = 0$, and also all
the values of 𝑥 for which $f'(x)$ is not defined, are called critical values of the function 𝑓(𝑥).
If 𝑎 is a critical value, (𝑎, 𝑓(𝑎)) is called a critical point.
To find the relative maxima, relative minima, and inflexion points of a function 𝑓(𝑥), two
commonly used tests: First Derivative Test and Second Derivative test are as follows.
Since the derivative of a function is itself a function, it too may be differentiated. When this
is done, the result may also be differentiated. Continuing in this way, we obtain higher-order
derivatives. If 𝑦 = 𝑓(𝑥) then 𝑓′(𝑥) is called the first order derivative of 𝑓(𝑥) with respect to
𝑥. The derivative of 𝑓′(𝑥), denoted by 𝑓′′(𝑥), is called the second-order derivative of 𝑓(𝑥)
with respect to x etc.
If 𝑦 = 𝑓(𝑥)
First derivative $f'(x) = \frac{dy}{dx}$
Second derivative $f''(x) = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d^2y}{dx^2}$
Third derivative $f'''(x) = \frac{d}{dx}\left(\frac{d^2y}{dx^2}\right) = \frac{d^3y}{dx^3}$
Fourth derivative $f^{iv}(x) = \frac{d}{dx}\left(\frac{d^3y}{dx^3}\right) = \frac{d^4y}{dx^4}$
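The Second Derivative Test
Suppose $f'(a) = 0$.
If $f''(a) > 0$, then (𝑎, 𝑓(𝑎)) is a relative minimum point.
If $f''(a) < 0$, then (𝑎, 𝑓(𝑎)) is a relative maximum point.
If $f''(a) = 0$, the test gives no information, that is (𝑎, 𝑓(𝑎)) may be a relative minimum point,
a relative maximum point, or an inflexion point (use the first derivative test).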
Illustration 1.33
Use the second derivative test to examine the function 𝑓(𝑥) = 6𝑥 4 − 8𝑥 3 + 1 for
relative maxima and minima.
$f(x) = 6x^4 - 8x^3 + 1$
$f'(x) = 24x^3 - 24x^2$
Setting $f'(x) = 0$: $24x^2(x - 1) = 0$, so the critical values are 𝑥 = 0 and 𝑥 = 1.
To perform the first derivative test at 𝑥 = 0, let’s select two arbitrary values around
𝑥 = 0 such that 𝑥 = −0.5 and 𝑥 = 0.5.
∴ 𝑓 ′ (𝑥) does not change sign, then the critical point at 𝑥 = 0 is an inflexion point.
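The tests applied in Illustration 1.33 can be reproduced by evaluating the derivatives at the critical values. The sketch below assumes the derivatives obtained by the power rule, $f'(x) = 24x^3 - 24x^2$ and $f''(x) = 72x^2 - 48x$; the function names `f1` and `f2` are illustrative.

```python
def f(x):
    return 6 * x**4 - 8 * x**3 + 1

def f1(x):
    """First derivative of f."""
    return 24 * x**3 - 24 * x**2

def f2(x):
    """Second derivative of f."""
    return 72 * x**2 - 48 * x

# Second derivative test at the critical values x = 0 and x = 1
print(f2(1), f(1))   # positive second derivative: relative minimum at (1, f(1))
print(f2(0))         # zero: the second derivative test gives no information

# First derivative test around x = 0: the sign does not change
print(f1(-0.5), f1(0.5))
```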
What differentiation does is, it looks at the effect of an infinitely small change in the
independent variable 𝑥 on the dependent variable 𝑦 in a function 𝑦 = 𝑓(𝑥). In Economics,
marginal revenue (MR) is sometimes defined as the increase in total revenue (TR) received
from sales caused by an increase in output by 𝑞 = 1 unit. This is not a precise definition
though. It only gives an approximate value for marginal revenue, and it will vary if the units
that output is measured in, are changed.
A more precise definition of marginal revenue is that it is the rate of change of total revenue
relative to increases in output. We know that the slope of a function can be found by
differentiation and so it must be the case that,
$MR = \frac{dTR}{dq}$
Marginal cost (MC) is the rate of change of the total cost (TC) function. In fact, in nearly all
situations where one is dealing with the concept of a marginal increase, the marginal function
is equal to the rate of change of the original function i.e., to derive the marginal function one
just differentiates the original function.
$MC = \frac{dTC}{dq}$
Illustration 1.34
We are now ready to see how calculus can help a firm to find the quantity that maximises
profit. At this stage, we shall just use the MC = MR rule for profit maximisation.
Illustration 1.35
A monopoly faces the demand function 𝑝 = 460 − 2𝑞 and the cost function 𝑇𝐶 = 20 +
0.5𝑞 2 . How much should it sell to maximise profit and what will this maximum profit
be?
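Given $TC = 20 + 0.5q^2$, $MC = \frac{dTC}{dq} = q$
Since $p = 460 - 2q$, $TR = pq = 460q - 2q^2$
Then, $MR = \frac{dTR}{dq} = 460 - 4q$
Setting MR = MC,
460 − 4𝑞 = 𝑞
460 = 5𝑞
∴ 𝑞 = 92
The maximum profit is TR − TC = $460(92) - 2(92)^2 - 20 - 0.5(92)^2 = 21{,}140$.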
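The MC = MR result can be cross-checked by evaluating profit directly. This is a sketch under the illustration's demand and cost functions; the function names are hypothetical, and the search over whole-number quantities is an assumption made only for the check.

```python
def total_revenue(q):
    price = 460 - 2 * q          # demand function p = 460 - 2q
    return price * q

def total_cost(q):
    return 20 + 0.5 * q**2

def profit(q):
    return total_revenue(q) - total_cost(q)

# Price stays non-negative for q up to 230, so search that range
best_q = max(range(0, 231), key=profit)

print(best_q, profit(best_q))
```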
Integral calculus involves the process of finding the function itself when its derivative is
known. This is the inverse process to finding the derivative of a function. The process of
finding an integral of a function is called integration. There are two kinds of integration,
indefinite and definite. We will discuss the indefinite type of integration in this section.
If 𝐹(𝑥) is an integral with respect to 𝑥 of the function 𝑓(𝑥), the relationship between 𝐹(𝑥)
and 𝑓(𝑥) is expressed as follows:
∫ 𝑓(𝑥)𝑑𝑥 = 𝐹(𝑥) + 𝐶
Where the left-hand part is read as “integral of 𝒇(𝒙) with respect to x”. The symbol ∫ 𝑑𝑥
is the integral sign, 𝑓(𝑥) is the integrand, 𝐹(𝑥) is a particular integral, 𝐶 is constant of
integration, and 𝐹(𝑥) + 𝐶 is the indefinite integral.
Rule 1
For any constant 𝑘, $\frac{d(kx)}{dx} = k$. Hence $\int k\,dx = kx + C$
Illustration 1.36
$\int 3\,dx = 3x + C$
Here C is used because the derivative of a constant is 0. For example, $\frac{d(3x + 5)}{dx} = 3$
as well as $\frac{d(3x - 2)}{dx} = 3$.
Therefore
$\int 3\,dx = 3x + C$ is valid for any constant C.
Rule 2
For any number $n \neq -1$, $\frac{d}{dx}\left(\frac{x^{n+1}}{n+1}\right) = x^n$, hence $\int x^n\,dx = \frac{x^{n+1}}{n+1} + C$
Illustration 1.37
$\int x^3\,dx = \frac{x^{3+1}}{3+1} + C = \frac{x^4}{4} + C$
Rule 3
$\frac{d(\ln x)}{dx} = \frac{1}{x}$, hence $\int \frac{1}{x}\,dx = \ln x + C$
Hence, $\int \frac{f'(x)}{f(x)}\,dx = \ln f(x) + C$
Illustration 1.38
$\int \frac{2x}{x^2 + 1}\,dx = \ln(x^2 + 1) + C$
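Because integration reverses differentiation, each result above can be checked by differentiating the answer numerically and comparing it with the integrand. The sketch below uses a central difference approximation; the helper `derivative` and the step size are assumptions for illustration, not part of the rules themselves.

```python
import math

def derivative(F, x, h=1e-6):
    """Central difference approximation of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

# (antiderivative, integrand) pairs from Illustrations 1.36 to 1.38
checks = [
    (lambda x: 3 * x, lambda x: 3.0),
    (lambda x: x**4 / 4, lambda x: x**3),
    (lambda x: math.log(x**2 + 1), lambda x: 2 * x / (x**2 + 1)),
]

# Largest gap between the numerical derivative and the integrand
max_error = max(
    abs(derivative(F, x) - f(x)) for F, f in checks for x in (0.5, 1.0, 2.0)
)
print(max_error)
```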
(b) $\frac{x}{4} - \frac{5x + 8}{6} = \frac{2x - 9}{3}$
(c) 3𝑥 = 7 + 𝑦
5𝑥 − 9𝑦 = 41
5) The supply function for a commodity is, 𝑠 = 300𝑝, and the demand function is,
𝑑 = −200𝑝 + 150,000, where 𝑝 is the sales price per unit.
(c) $4^x - 10(2^x) + 16 = 10$
10)The population of the world was 3.78 billion in 1972 and 4.43 billion in 1980. Assuming
that population growth is described by an exponential function of the form $f(t) = ae^{rt}$,
(a) Find the population growth function.
(b) Find the population of the world in the year 2050.
(c) Approximately when will the population of the world reach 50 billion?
g) 𝑓(𝑥) = 𝑥 2 (𝑥 − 2)(𝑥 + 4)
h) 𝑔(𝑥) = 𝑥 2 (3 + 𝑥)3
i) $h(x) = \frac{x}{x - 1}$
−23
j) 𝑦 = √12𝑡 2
−3𝑡+2
k) $f(x) = \frac{1 + \sqrt{x}}{1 - \sqrt{x}}$
l) 𝑓(𝑥) = (𝑥 + √1 + 𝑥 2 )4
12)Find the indicated derivative.
(a) $y = 4x^3 - 5x^2 + 7x - 17$; $\frac{d^3y}{dx^3}$
(b) $f(x) = \frac{3x - 1}{x + 2}$; $f'''(x)$
13)Use the first derivative test to find all relative maximum, relative minimum, and inflexion
points for the following functions.
14)Use the second derivative test to find all relative maxima and minima for given functions.
80−𝑞
15)The demand equation for a manufacturers’ product is = , where 𝑞 is the number of
𝑥+4
units and 𝑝 is price per unit. At what value of 𝑞 is revenue maximised?
16)A box with an open top is to have a square base and a volume of 500 cubic centimetres. If
the material for the box costs Rs. 5 per square centimetre, find the dimensions of the box
that minimise its cost.
Learning Outcomes
2.0 Introduction
Business statistics is a field of study that involves the collection, analysis, interpretation, and
presentation of data in order to make informed business decisions. It encompasses the
application of statistical methods and techniques to address various business-related
problems and situations. Business statistics helps businesses in a variety of ways, including:
▪ Quality Control: Assessing and monitoring the quality of products or processes through
statistical methods, such as control charts and statistical process control. This ensures
consistent quality and helps in identifying and rectifying issues.
Business statistics provide valuable insights and enable businesses to make data-driven
decisions, optimise processes, identify opportunities, manage risks, and evaluate
performance.
• Measures of Central Tendency: These measures describe the central or average value of
a dataset. The most commonly used measures are the mean (average), median (middle
value), and mode (most frequent value).
• Measures of Dispersion: These measures indicate the spread or variability of the data.
Common measures include the range (difference between the maximum and minimum
values), variance, and standard deviation.
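These measures are available in most statistical tools. The following is a minimal sketch using Python's standard `statistics` module on a small hypothetical data set; the figures are invented for illustration, and the population forms of variance and standard deviation are assumed here since the text has not yet distinguished them from the sample forms.

```python
import statistics

# Hypothetical data: units sold on seven consecutive days
daily_sales = [4, 8, 6, 5, 3, 8, 8]

# Measures of central tendency
mean_value = statistics.mean(daily_sales)
median_value = statistics.median(daily_sales)
mode_value = statistics.mode(daily_sales)

# Measures of dispersion
range_value = max(daily_sales) - min(daily_sales)
variance_value = statistics.pvariance(daily_sales)   # population variance
std_dev_value = statistics.pstdev(daily_sales)       # population standard deviation

print(mean_value, median_value, mode_value)
print(range_value, round(variance_value, 3), round(std_dev_value, 3))
```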
Inferential statistics involves making inferences and drawing conclusions about a population
based on sample data. It extends beyond the descriptive analysis of a dataset and enables us
to generalise and make predictions about a larger population. Here are some key concepts
and techniques used in inferential statistics:
• Hypothesis Testing: Hypothesis testing is used to make decisions about the population
based on sample data. It involves formulating a null hypothesis and an alternative
hypothesis, collecting sample data, and using statistical tests to determine whether the
evidence supports rejecting or failing to reject the null hypothesis.
In this chapter we are focusing solely on descriptive statistics. Firstly, let's identify simple
statistical definitions.
2.3 Data
Data refers to factual information or raw facts that can be collected, measured, or observed.
It can exist in various forms such as numbers, text, images, or audio. Data serves as the
foundation for generating knowledge and making informed decisions. It can be analysed,
organised, and interpreted to uncover patterns, trends, and insights. Data plays a crucial role
in research, business operations, and everyday life, enabling us to make informed choices
based on evidence.
CHAPTER 2 Page 45
Data is classified into two broad categories:
• Quantitative data
• Qualitative data
Quantitative data refers to information that is expressed in numerical form and can be
measured or counted. It deals with quantities and numerical values, allowing for
mathematical and statistical analysis. Examples of quantitative data include numerical
measurements, counts, percentages, and ratings.
Continuous data refers to a type of quantitative data that can take on any value within a
specific range or interval. It is characterised by having an infinite number of possible values
within that range. Continuous data is typically measured on a continuous scale, such as time,
temperature, weight, height, or distance. Continuous data can be infinitely divided into
smaller and smaller units, according to the degree of accuracy. For example, temperature can
be measured with decimal precision, allowing for values like 25.5 degrees Celsius or 73.8
degrees Fahrenheit.
Discrete data refers to a type of quantitative data that can only take on specific, distinct
values. These values are typically whole numbers or categories and cannot be subdivided
further. Discrete data is often the result of counting or categorising items or events. Examples
of discrete data include the number of students in a classroom, the number of cars sold in a
month, or the types of animals in a zoo.
Interval data represents quantities where the difference between the values is meaningful
and consistent. It does not have a true zero point but can have positive and negative values.
Examples include temperature measured in Celsius or Fahrenheit.
Ratio data has all the properties of interval data, but with a true zero point. Ratios between
values can be calculated, and the absence of certain values is represented by zero. Examples
include income, height, or profit.
Ungrouped data refers to data where each individual value is listed separately without
any categorization or grouping. It presents the raw or individual values as they are collected
or observed.
Nominal data represents categories or labels with no inherent order or ranking. Examples
of nominal data include gender (male, female), marital status (single, married, divorced), or
product categories (electronics, clothing, furniture).
Example 2.2 – Nominal data
• Do you currently have a Facebook profile? ❑Yes ❑No
Ordinal data represents categories with a natural order or ranking. It involves qualitative
characteristics that can be ranked or ordered based on a specific criterion. Examples of
ordinal data include customer satisfaction ratings (poor, satisfactory, good, excellent),
ratings for a product (low, medium, high), or education level (high school, bachelor’s degree,
master’s degree).
Secondary data refers to existing data that has been collected by others or for a different
purpose than the current study. It is readily available through sources like research articles,
books, government reports, databases, or online repositories. Researchers utilise secondary
data to supplement their own research, saving time and resources compared to collecting
primary data. It can provide historical, comparative, or contextual information for research
purposes.
Examples of secondary data include census data, market research reports, historical records,
academic studies, and publicly available datasets. However, it is important to critically
evaluate the quality, relevance, and limitations of secondary data before using it in a research
study.
A frequency distribution for qualitative data lists all categories and the number of elements
that belong to each of the categories.
The variable “Offense type” has been classified into the following categories: rape, robbery,
burglary, arson, murder, theft, and manslaughter. These seven categories are listed under
the column labelled “Offense type”, and the number of occurrences of each category is
represented by tally marks. The number of tallies for each offense is counted and displayed
in the column labelled “Frequency”. Sometimes, the term “Absolute frequency” is used
interchangeably with “Frequency”.
A bar graph is a graphical representation made up of bars, where the heights of the bars represent the
frequencies of different categories. Bars can be represented either vertically or horizontally
in a bar graph. It visually displays the same information about the qualitative data that a
frequency distribution presents in a tabular format.
A pie chart is also used to graphically display qualitative data. It involves dividing a circle into
segments that represent the relative frequencies or percentages of different categories. The
angle of each segment is determined by multiplying 360° by the corresponding relative frequency.
For instance, referring to Illustration 2.1, the angle of the “theft” segment can be
calculated as
$$360^\circ \times \frac{8}{25} = 115.2^\circ \approx 115^\circ$$
There are several similarities between frequency distributions for qualitative data and
frequency distributions for quantitative data. Firstly, let’s discuss the terminology used for
frequency distributions of quantitative data. Afterwards, we will provide examples to
illustrate the construction of frequency distributions for quantitative data.
Class limits, class boundaries, class marks, and class width
Class limits: Each class in a frequency distribution has two class limits. The lower-class limit
is the smallest value that can be included in the class, while the upper-class limit is the largest
value that can be included in the class.
Class boundaries: The class boundaries are the values that separate one class from another
and help define the intervals within which data falls. Class boundaries are calculated
by subtracting half a unit of measure from the lower-class limit of a class and adding half a unit
of measure to the upper-class limit of a class. Alternatively, we can add the lower-class limit of a class
to the upper-class limit of the preceding class and divide the sum by 2. This calculation
provides the upper boundary for the preceding class and the lower boundary for the current
class.
Class mark: The class mark represents the centre of the class. It is calculated by adding the
lower-class limit to the upper-class limit of a class and dividing the sum by 2. It is also known
as class midpoint.
Class width: The class width refers to the difference between the upper and lower
boundaries of a class. It represents the range or interval of values covered by each class in
the frequency distribution.
IQ Score Frequency
80 - 94 8
95 - 109 14
110 - 124 24
125 - 139 16
140 - 154 13
The above frequency distribution table consists of five classes: 80 - 94, 95 – 109, 110 -124,
125 – 139, and 140 - 154.
Each class is defined by a lower-class limit and an upper-class limit. For this distribution,
lower class limits are 80, 95, 110, 125, and 140, while the upper-class limits are 94, 109,
124, 139, and 154.
The lower-class boundaries are 79.5, 94.5, 109.5, 124.5, and 139.5 while the upper-class
boundaries are 94.5, 109.5, 124.5, 139.5, and 154.5.
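The definitions above can be checked against the IQ score table with a short Python sketch. The function name `describe_class` and the assumption that scores are recorded to the nearest whole number (so the unit of measure is 1) are illustrative choices, not part of the original text.

```python
# Class limits taken from the IQ score frequency table above.
classes = [(80, 94), (95, 109), (110, 124), (125, 139), (140, 154)]

def describe_class(lower_limit, upper_limit, unit=1):
    """Return boundaries, class mark and class width for one class.

    `unit` is the unit of measure; 1 is assumed here because IQ scores
    are whole numbers.
    """
    lower_boundary = lower_limit - unit / 2
    upper_boundary = upper_limit + unit / 2
    return {
        "boundaries": (lower_boundary, upper_boundary),
        "mark": (lower_limit + upper_limit) / 2,
        "width": upper_boundary - lower_boundary,
    }

summary = [describe_class(lo, hi) for lo, hi in classes]
for (lo, hi), row in zip(classes, summary):
    print(f"{lo}-{hi}: boundaries={row['boundaries']}, "
          f"mark={row['mark']}, width={row['width']}")
```

Under these assumptions every class has a width of 15, and the class marks are 87, 102, 117, 132 and 147.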
Illustration 2.3
The price of 500 aspirin tablets was determined for each of twenty randomly selected
stores as part of a larger consumer study. The prices are as follows:
2.50 2.95 2.65 3.10 3.15 3.05 3.05 2.60 2.70 2.75
2.80 2.80 2.85 2.80 3.00 3.00 2.90 2.90 2.85 2.85
Suppose we want to group these data into seven classes. Since the maximum price is 3.15
and the minimum price is 2.50, the price range is 0.65. Each class should then have a width
equal to approximately 1/7 of 0.65, which is approximately 0.093. There is flexibility in
choosing the classes while following the guidelines. The following table shows the results
if a class width of 0.10 is selected and the first class begins at the minimum price.
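As a rough check on the grouping described above, the following Python sketch tallies the twenty prices into seven classes of width 0.10 starting at 2.50. The class limits it prints (2.50 - 2.59, 2.60 - 2.69, and so on) are an assumption about how the classes are labelled; prices are converted to whole cents to avoid floating-point rounding problems.

```python
prices = [2.50, 2.95, 2.65, 3.10, 3.15, 3.05, 3.05, 2.60, 2.70, 2.75,
          2.80, 2.80, 2.85, 2.80, 3.00, 3.00, 2.90, 2.90, 2.85, 2.85]

# Work in whole cents so that class membership is decided exactly.
cents = [round(p * 100) for p in prices]
start, width, n_classes = 250, 10, 7   # first class begins at the minimum price

counts = [0] * n_classes
for c in cents:
    counts[(c - start) // width] += 1

for k, frequency in enumerate(counts):
    lower = start + k * width
    upper = lower + width - 1          # assumed upper class limit, e.g. 2.59
    print(f"{lower / 100:.2f} - {upper / 100:.2f}: {frequency}")
```

The frequencies add up to 20, which confirms that every price falls into exactly one class.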
2.5.2 Histograms
A histogram is a graph that displays the class boundaries on the horizontal axis and their
corresponding frequencies on the vertical axis. Each class is represented by a vertical bar,
and the height of the bar corresponds to the frequency of that class. While similar to a bar
graph, a histogram differs in that it uses class boundaries along with frequencies and it has
no gaps between the bars. On the other hand, a bar graph uses categories and frequencies.
To illustrate, let’s construct a histogram for the example of aspirin prices mentioned earlier.
A less than type cumulative frequency distribution, often referred to simply as a cumulative frequency
distribution, shows for each class interval the number of observations that are less than or equal to the upper
boundary of that class.
A more than type cumulative frequency distribution, also known as a greater than
cumulative frequency distribution, shows for each class interval the number of observations that are greater than
or equal to the lower boundary of that class.
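Both types of cumulative frequency can be built from an ordinary frequency table by running totals. The sketch below reuses the IQ score table from earlier in this chapter purely as sample input; the variable names are illustrative.

```python
from itertools import accumulate

# Class boundaries and frequencies from the IQ score table.
lower_boundaries = [79.5, 94.5, 109.5, 124.5, 139.5]
upper_boundaries = [94.5, 109.5, 124.5, 139.5, 154.5]
frequencies = [8, 14, 24, 16, 13]

# Less than type: running total from the first class upwards.
less_than = list(accumulate(frequencies))

# More than type: running total from the last class downwards.
more_than = list(accumulate(reversed(frequencies)))[::-1]

for ub, cf in zip(upper_boundaries, less_than):
    print(f"less than or equal to {ub}: {cf}")
for lb, cf in zip(lower_boundaries, more_than):
    print(f"greater than or equal to {lb}: {cf}")
```

The last less than value and the first more than value both equal the total number of observations (75), which is a quick consistency check.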
Illustration 2.4 – Less than cumulative frequency and more than cumulative
frequency.
Descriptive summary measures, also known as descriptive statistics, are used to summarise
and describe the main characteristics of a dataset. These measures provide insights into the
central tendency, dispersion, and shape of the data. If the measures are computed for data from
a sample, they are referred to as sample statistics.
If they are computed for data from a population, they are referred to as population
parameters. Descriptive summary measures provide a comprehensive overview of the data,
allowing for comparisons, identification of patterns, and understanding the data
distributions.
Mean: The arithmetic average of the values in the dataset. Suppose $x_1, x_2, \dots, x_n$ are the
measurements obtained for the variable X of randomly selected n individuals. Then the
sample mean of X, denoted by $\bar{x}$, is the sum of the values in the sample divided by the
number of values in the sample.
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
The table below displays the monthly starting salaries of 12 university graduates who are
experts in the field of business. Compute the mean of the salary of selected graduates.
$$\bar{x} = \frac{\sum_{i=1}^{12} x_i}{12} = \frac{34500 + 35500 + \cdots + 48000}{12} = \frac{424800}{12} = \text{Rs. } 35{,}400$$
Median: The middle value in the dataset when it is arranged in ascending or descending
order. For an odd number of observations, the median is the middle value and for an even
number of observations, the median is the average of the two middle values. While the mean
is often the preferred measure of central location, there are situations where the median takes
precedence. This is because the mean can be susceptible to the influence of extremely small or
large data values.
1) Compute the median of the given values: 46, 42, 32, 46, 54.
32 42 46 46 54
The median is the middle value since there is an odd number of observations (𝑛 = 5).
Median = 46.
There is an even number of observations here (𝑛 = 12), hence the median is the
average of the middle two values.
$$\text{Median} = \frac{3490 + 3520}{2} = 3505$$
Mode: The value(s) that appear most frequently in the data set.
The systems manager, who is responsible for overseeing the company’s network,
maintains a log of the daily occurrences of server failures. The following data
represents the number of server failures that occurred over a two-week period:
1 3 0 3 26 2 7 4 0 2 3 3 6 3
Find the mode of the number of server failures that occurred over a two-week period.
The ordered array for these data is,
0 0 1 2 2 3 3 3 3 3 4 6 7 26
Mode = 3.
Note: There are situations where a dataset may have multiple modes or no mode at all.
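The three measures of central tendency can be compared on the server failure data using Python's standard `statistics` module. This is only a sketch; `multimode` is used so that datasets with several modes are handled too.

```python
import statistics

# Daily server failures over the two-week period.
failures = [1, 3, 0, 3, 26, 2, 7, 4, 0, 2, 3, 3, 6, 3]

mean = statistics.mean(failures)        # sum of values divided by n
median = statistics.median(failures)    # average of the two middle values (n is even)
modes = statistics.multimode(failures)  # list of the most frequent value(s)

print("Mean:", mean)
print("Median:", median)
print("Mode(s):", modes)
```

The mean (4.5) is pulled above the median and the mode (both 3) by the single extreme value of 26, which illustrates why the median is sometimes preferred.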
Geometric Mean: A measure of the average value of a set of
numbers, calculated by taking the $n$th root of the product of the $n$ values.
$$\bar{x}_G = (x_1 \times x_2 \times \cdots \times x_n)^{1/n}$$
The geometric mean has various applications, such as calculating the overall growth
rate of investment returns across multiple periods, population growth rate, economic
growth rate, or sales growth rate.
Suppose $R_1, R_2, \dots, R_n$ are the rates of return for $n$ consecutive years. The geometric mean
rate of return per year is given by
$$\bar{R} = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1$$
The percentage change in the Russell 2000 Index of the stock prices of 2,000 small
companies was -33.79% in 2008 and 27.17% in 2009. Compute the geometric mean rate
of return per year.
$$\bar{R} = [(1 + R_1) \times (1 + R_2)]^{1/2} - 1$$
$$\bar{R} = [(1 - 0.3379) \times (1 + 0.2717)]^{1/2} - 1$$
$$\bar{R} = -0.0824$$
The geometric mean rate of return in the Russell 2000 Index for the two years is -8.24%
per year.
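The same calculation can be sketched in Python. The function name `geometric_mean_return` is illustrative; the rates are entered as decimals, as in the worked solution.

```python
import math

def geometric_mean_return(rates):
    """Geometric mean rate of return for a list of periodic rates (as decimals)."""
    growth = math.prod(1 + r for r in rates)   # compound growth factor
    return growth ** (1 / len(rates)) - 1

# Russell 2000 Index: -33.79% in 2008 and 27.17% in 2009.
rate = geometric_mean_return([-0.3379, 0.2717])
print(f"Geometric mean rate of return: {rate:.2%}")
```

Note that the arithmetic average of the two rates, (-33.79% + 27.17%) / 2 = -3.31%, would understate the loss actually experienced over the two years.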
2.6.2 Quartiles
After arranging the data from smallest to largest, it is often desirable to divide data into four
parts, with each part containing approximately one-fourth, or 25%, of the observations. The
division points are referred to as the quartiles and are defined as
$Q_1$ = first quartile, $Q_2$ = second quartile (also the median), $Q_3$ = third quartile
$$Q_1 = \left(1 \times \frac{n+1}{4}\right)^{th} \text{ data value}$$
$$Q_2 = \left(2 \times \frac{n+1}{4}\right)^{th} \text{ data value}$$
$$Q_3 = \left(3 \times \frac{n+1}{4}\right)^{th} \text{ data value}$$
Compute the first, second and third quartiles of calories for the cereals.
Ordered dataset:
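$$Q_1 = \left(\frac{n+1}{4}\right)^{th} \text{ data value} = \left(\frac{7+1}{4}\right)^{th} \text{ data value} = 2^{nd} \text{ data value} = 100$$
$$Q_2 = \left(2 \times \frac{n+1}{4}\right)^{th} \text{ data value} = \left(2 \times \frac{7+1}{4}\right)^{th} \text{ data value} = 4^{th} \text{ data value} = 110$$
$$Q_3 = \left(3 \times \frac{n+1}{4}\right)^{th} \text{ data value} = \left(3 \times \frac{7+1}{4}\right)^{th} \text{ data value} = 6^{th} \text{ data value} = 190$$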
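$$Q_1 = \left(\frac{n+1}{4}\right)^{th} \text{ data value} = \left(\frac{10+1}{4}\right)^{th} \text{ data value} = 2.75^{th} \text{ data value}$$
$$Q_1 = 2^{nd} \text{ data value} + 0.75 \times (3^{rd} \text{ data value} - 2^{nd} \text{ data value}) = 31 + 0.75 \times (35 - 31) = 31 + 3 = 34$$
$$Q_2 = \left(2 \times \frac{n+1}{4}\right)^{th} \text{ data value} = \left(2 \times \frac{10+1}{4}\right)^{th} \text{ data value} = 5.5^{th} \text{ data value}$$
$$Q_2 = 5^{th} \text{ data value} + 0.5 \times (6^{th} \text{ data value} - 5^{th} \text{ data value}) = 39 + 0.5 \times (40 - 39) = 39 + 0.5 = 39.5$$
$$Q_3 = \left(3 \times \frac{n+1}{4}\right)^{th} \text{ data value} = \left(3 \times \frac{10+1}{4}\right)^{th} \text{ data value} = 8.25^{th} \text{ data value}$$
$$Q_3 = 8^{th} \text{ data value} + 0.25 \times (9^{th} \text{ data value} - 8^{th} \text{ data value}) = 44 + 0.25 \times (44 - 44) = 44$$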
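The positional rule with linear interpolation used above can be sketched as a small Python function. The ten values below are hypothetical: they were chosen only so that the 2nd, 3rd, 5th, 6th, 8th and 9th ordered values match those quoted in the calculation, and the function name `quartile` is likewise illustrative.

```python
def quartile(ordered, k):
    """k-th quartile (k = 1, 2, 3) using the k(n + 1)/4 position rule.

    `ordered` must already be sorted from smallest to largest. A fractional
    position is resolved by linear interpolation between neighbouring values.
    """
    n = len(ordered)
    position = k * (n + 1) / 4          # 1-based position in the ordered data
    whole = int(position)
    fraction = position - whole
    if whole < 1:                       # position falls before the first value
        return ordered[0]
    if whole >= n:                      # position falls at or after the last value
        return ordered[-1]
    lower, upper = ordered[whole - 1], ordered[whole]
    return lower + fraction * (upper - lower)

# Hypothetical ordered dataset consistent with the values used above.
data = [29, 31, 35, 39, 39, 40, 43, 44, 44, 52]

q1, q2, q3 = (quartile(data, k) for k in (1, 2, 3))
print("Q1 =", q1, " Q2 =", q2, " Q3 =", q3)
```

Statistical software often uses slightly different positional rules, so quartiles from other tools may differ a little from the values obtained with this textbook rule.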
2.6.3 Measures of Dispersion
When multiple datasets are compared, their averages may be the same even though
the values within these datasets differ significantly in magnitude. As a result, a measure of central
tendency alone may not be the most typical or representative
summary in many cases. To understand the extent of spread or variation around these averages,
we need to consider other measures, namely measures of dispersion.
Range: The range is the simplest measure of dispersion, defined as the difference between
the minimum and the maximum value in a data distribution. It provides a rough estimate of
dispersion, but it is influenced by the extreme values rather than considering all the items in
the dataset.
Inter-Quartile Range: By eliminating the lowest 25% and the highest 25% of items in a
series, we are left with the central 50%, which are ordinarily free of extreme values. Inter
quartile range (IQR) is computed by deducting the value of first quartile (𝑄1 ) from the value
of third quartile (𝑄3 ).
𝐼𝑄𝑅 = 𝑄3 − 𝑄1
𝐼𝑄𝑅 = 𝑄3 − 𝑄1 = 44 − 34 = 10
Variance and standard deviation: Two commonly used measures of dispersion that take
into account all the data values in a distribution are the variance and the standard deviation.
The variance is the average of squared differences between each data point and the
mean of the dataset. It quantifies the overall spread of the data points around the mean. A
higher variance indicates greater variability in the dataset. The standard deviation is the
square root of the variance.
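For sample data, variance and standard deviation are denoted by $S^2$ and $S$, respectively.
$$\text{Sample variance} = S^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n-1}$$
$$\text{Sample standard deviation} = S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n-1}}$$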
10, 12, 9, 11, 14, 15, 13, 16, 18, 17, 16, 17
Find the variance and standard deviation of the monthly revenue for the given data.
$$\text{Mean revenue} = \frac{10+12+9+11+14+15+13+16+18+17+16+17}{12} = \frac{168}{12} = \text{Rs. } 14 \text{ (in millions)}$$
$$\text{Variance} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1} = \frac{(-4)^2+(-2)^2+(-5)^2+(-3)^2+0^2+1^2+(-1)^2+2^2+4^2+3^2+2^2+3^2}{11} = \frac{98}{11} \approx 8.91$$
$$\text{Standard deviation} = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}} = \sqrt{8.91} \approx \text{Rs. } 2.98 \text{ (in millions)}$$
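The arithmetic can be verified with a short Python sketch that applies both forms of the sample variance formula (the definitional form and the computational shortcut) to the twelve revenue figures and cross-checks them against the standard `statistics` module. Variable names are illustrative.

```python
import statistics

# Monthly revenue figures (Rs. millions) from the example above.
revenue = [10, 12, 9, 11, 14, 15, 13, 16, 18, 17, 16, 17]

n = len(revenue)
mean = sum(revenue) / n

# Definitional form: sum of squared deviations divided by n - 1.
sum_sq_dev = sum((x - mean) ** 2 for x in revenue)
variance = sum_sq_dev / (n - 1)
std_dev = variance ** 0.5

# Computational shortcut: (sum of x^2 - n * mean^2) / (n - 1).
variance_shortcut = (sum(x * x for x in revenue) - n * mean ** 2) / (n - 1)

print(f"mean = {mean}, sum of squared deviations = {sum_sq_dev}")
print(f"variance = {variance:.2f}, standard deviation = {std_dev:.2f}")
print(f"library check: {statistics.variance(revenue):.2f}, "
      f"{statistics.stdev(revenue):.2f}")
```

Both forms of the formula give the same variance, and `statistics.variance` also divides by n - 1, so it agrees with the sample formulas used in this chapter.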
Coefficient of variation measures the relative variability of a dataset compared to its mean.
It is expressed as a percentage and is used to compare the variability of different datasets,
particularly when their means differ greatly. A higher coefficient of variation indicates
that the corresponding group of data is more variable or less homogeneous.
$$CV = \frac{S}{\bar{x}} \times 100\%$$
Illustration 2.14
In this case, Option B has a higher CV of 80%, indicating a higher level of relative risk or
volatility compared to Option A with a CV of 40%. This information can assist investors in
assessing the trade-off between potential returns and associated risks when making
investment decisions.
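A minimal Python sketch of this comparison follows. The means and standard deviations are hypothetical figures chosen only to reproduce coefficients of variation of 40% and 80%; they are not the data of Illustration 2.14.

```python
def coefficient_of_variation(std_dev, mean):
    """Coefficient of variation as a percentage: (S / mean) * 100."""
    return 100 * std_dev / mean

# Hypothetical investment options (mean return and standard deviation, in %).
option_a = {"mean": 10.0, "std_dev": 4.0}
option_b = {"mean": 10.0, "std_dev": 8.0}

cv_a = coefficient_of_variation(option_a["std_dev"], option_a["mean"])
cv_b = coefficient_of_variation(option_b["std_dev"], option_b["mean"])

print(f"CV of Option A: {cv_a:.0f}%")
print(f"CV of Option B: {cv_b:.0f}%")
print("More variable option:", "B" if cv_b > cv_a else "A")
```

Because the coefficient of variation is unit-free, the same function can compare datasets measured in different units or with very different means.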
Shape refers to the pattern of the distribution of data values across the entire range of values.
A distribution can be either symmetrical or skewed. In a symmetrical distribution, the values
below the mean are distributed in the same manner as the values above the mean. This
balance between low and high values creates symmetry. On the other hand, in a skewed
distribution, the values are not symmetrical around the mean. This asymmetry leads to an
imbalance either in the lower values or the higher values. In a symmetric distribution, the mean,
median and mode coincide.
The following Figure depicts three data sets, each with a different shape.
Skewness and kurtosis are two shape-related statistics. The skewness statistic measures the
extent to which a set of data is not symmetric. A symmetric distribution has a skewness value
of zero. A right-skewed distribution has a positive skewness value, and a left-skewed
distribution has a negative skewness value.
The kurtosis statistic measures the relative concentration of values in the centre of the
distribution of a data set, as compared with the normal distribution. That is, data sets with high
kurtosis tend to have heavy tails, or outliers. Data sets with low kurtosis tend to have light tails, or lack of
outliers.
Chapter 2 - Reinforcement Problems
1) Classify each of the following as descriptive statistics or inferential statistics.
a) The average filling volume, percent of under-filling, average number of machine
breakdowns, and average number of particles per bottle are measured in a soft drink
bottling process.
b) Ten percent of the boxes of cereal sampled by a quality technician are found to be less
than the labelled weight. Based on this finding, the filling machine is adjusted to
increase the amount of fill.
c) The Daily FT gives several pages of numerical quantities concerning prices and trading
volumes of stocks listed in the Colombo Stock Exchange.
d) Based on a study of 500 graduates by a social researcher, a magazine reports that 25%
of all graduates are employed in the public sector.
3) In a national survey, 2000 schools were mailed questionnaires, and one of the questions
asked for the number of toilets in each school. Out of the 2000 surveys mailed, one
thousand of the surveys were completed and returned. What is the variable in the study
and how large is the data set?
6) Indicate the scale of measurement (ordinal, nominal, interval, or ratio) of each of the following variables:
racial origin, monthly phone bills, Fahrenheit and centigrade temperature scales, military
ranks, time taken to finish a 100m race, ranking of a personality trait, clinical diagnosis of
breast cancer, and calendar numbering of the years.
7) The following list gives the academic ranks of the 25 female faculty members at a small
liberal arts college: Give frequency and percentage distributions for these data.
8) The table below gives the frequency distribution for the cholesterol values of 45 patients
in a cardiac rehabilitation study. Give the lower and upper-class limits and boundaries as
well as the class marks for each class. Also, draw a histogram and an ogive for the frequency
distribution.
9) Compute the (arithmetic) mean, median, mode, quartiles, range, interquartile range, variance,
standard deviation, and coefficient of variation for the following datasets.
a) 15, 10, 30, 8, 20, 25, 30, 30, 20, 15, 8, 20, 20
b) 1, 6, 8, 3, 2, 5, 7, 4, 3, 3, 4, 5, 6, 6, 4, 5, 4
10) If a person received a 20% raise after one year of service and 10% raise after the second
year of service, find the average percentage raise per year.
11) The marks obtained by 500 students in a mathematics examination have been grouped
into class intervals, such as 1- 20, 21-40, 41-60, 61-80, and 81-100. The ogive for these
marks is shown below.
12) In a tea-bag-filling process, the weight of the tea in individual bags is a quality
characteristic of interest. The label on the tea package indicates an average weight of 5.5
grams per bag. However, achieving the exact amount of tea in each bag is challenging due
to various factors such as temperature and humidity variations within the factory,
differences in tea density, and the high-speed nature of the filling process. Below is a
sample of the weights of 20 tea bags produced within one hour by a single machine.
5.57 5.40 5.53 5.54 5.55 5.62 5.56 5.46 5.44 5.51
5.65 5.44 5.42 5.40 5.53 5.34 5.54 5.45 5.52 5.41
13) The following data represents the total fat (in grams) for burgers and pizzas from a
sample of fast-food sellers.
Burgers 35 19 31 34 43 39
Pizza 22 9 15 7 16 33 18 25
a) Compute mean, median, first quartile, and third quartile for burgers and pizzas
separately.
b) Compute inter-quartile range, standard deviation, and coefficient of variation for
burgers and pizzas separately.
c) Based on the measures obtained in a) and b), what comments can you make about the
variation in the fat content of burgers and pizzas?
Chapter 2 - Multiple choice questions with Answers
1. One of the distinctions between a population parameter and a sample statistic is,
a) A population parameter is only based on conceptual measurements, but a sample
statistic is based on a combination of real and conceptual measurement.
b) A sample statistic changes each time you try to measure it, but a population
parameter remains fixed.
c) A population parameter changes each time you try to measure it, but a sample
statistic remains fixed.
d) The true value of a sample statistic can never be known but the true value of a
population parameter can be known.
Answer: (b)
Answer: (d)
Answer: (b)
4. Which of the following would be the MOST suitable for displaying the proportions of a city's
budget spent on different items?
a) Bar chart
b) Line graph
c) Pie chart
d) Histogram
Answer: (c)
5. Which of the following divides the distribution of data into four subgroups?
a) Quartiles
b) Percentiles
c) Standard deviation
d) Median
Answer: (a)
6. Which of the following is not a measure of central tendency?
a) Mean
b) Median
c) Standard deviation
d) Mode
Answer: (c)
Answer: (a)
8. Pulse rates of five randomly selected female undergraduates are: 70, 64, 80, 74, 92. What
is the median pulse rate?
a) 77
b) 76
c) 80
d) 74
Answer: (d)
9. Which of the following is not influenced by the occurrence of extreme values in a sample?
a) Mean
b) Median
c) Variance
d) Geometric mean
Answer: (b)
Chapter 2 – Summary
• The use of graphs, charts, and tables and the calculation of various statistical measures to
organise and summarise information is called descriptive statistics.
• The portion of the population selected for analysis is called the sample.
• A variable is a characteristic of interest concerning the individual elements of a
population or a sample.
• Four levels of measurement are usually recognised in data: Nominal, Ordinal, Interval
and Ratio. In nominal data, numbers serve only as names or labels. Ordinal measures have all the
features of nominal measures and also represent rank order (1st, 2nd, 3rd etc.).
Interval measures have all the features of ordinal measures and are also separated by the
same interval, so differences between arbitrary pairs of numbers can be
meaningfully compared. Ratio measures have all the features of interval measures and
also have meaningful ratios between arbitrary pairs of numbers.
• Primary data are those which are collected for the first time, and they are original in
character. Secondary data are those which are already collected by someone for some
purpose and are available for the present study.
• A frequency distribution for qualitative data lists all categories and the number of
elements that belong to each of the categories.
• A bar graph is a graph composed of bars whose heights are the frequencies of the
different categories. A pie chart is also used to graphically display qualitative data. To
construct a pie chart, a circle is divided into portions that represent the relative
frequencies or percentages belonging to different categories.
• A histogram is a graph that displays the classes on the horizontal axis and the
frequencies of the classes on the vertical axis.
• A cumulative frequency distribution gives the total number of values that fall below
various class boundaries of a frequency distribution.
• An ogive is a graph in which a point is plotted above each class boundary at a height equal
to the cumulative frequency corresponding to that boundary.
• Summary measures are used to summarise the basic features of the data in a study.
• There is always a tendency of individual observations contained in any set of data to
cluster or centre around a specific value, which is the central value. This peculiar
characteristic of the data is referred to as central tendency.
• Dispersion measures:
Range: Difference between the largest and smallest value in the data
Inter-quartile Range: $Q_3 - Q_1$
Sample variance: $$S^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n-1}$$
Sample standard deviation: $$S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n-1}}$$
Learning Outcomes
• There is a 30% chance that this job will not be finished in time.
• There is a likelihood that the business will make a profit next year.
• Nine times out of ten he arrives late for his appointments.
• There is no possibility of delivering the goods before Tuesday.
All the above are expressions indicating a degree of uncertainty. A very important branch of
mathematics called the theory of probability provides a numerical measure of uncertainty.
The probability describes certainty by 1, impossibility by 0 and the various grades of
uncertainty by fractions or decimals between 0 and 1.
For example, the probability of obtaining some number from 1 to 6 when throwing a die is 1, but the
probability that the number 7 appears on the die is 0. If the die is fair, the probability of
getting the number 4 (or any particular number from 1 to 6) is 1/6, which falls between 0 and 1.
Illustration 3.1
1. Throwing a die
2. Tossing of a coin
3. Inspection of an item to determine whether it is defective or non-defective
a) One toss of a coin results in the outcomes (H, T). If the coin is fair, then each outcome
is equally likely.
b) Two tosses of a coin result in the outcomes (HH, HT, TH, TT). Again, if the coin is fair,
then each outcome is equally likely.
c) If a machine produces articles, some of which are defective, the outcomes are
(defective, or not defective). In this case the outcomes may not be equally likely.
Each possible outcome is called a sample point and the set of all possible outcomes is the
sample space, labelled as 𝑆. An event is a specific collection of sample points. For example,
when throwing an ordinary die, the sample space is S = {1,2,3,4,5,6}. Let 𝐸 be the event
"the number is odd", in this case,
E = {1,3,5}.
Suppose that an event $A$ occurs $m$ times in $n$ repetitions of a random experiment. Then the
ratio $\frac{m}{n}$ gives the relative frequency of the event $A$. This ratio will approach the probability of
occurrence of event $A$ as $n$ becomes large.
In practice we can only try to have a close estimate of P(A) based on a large number of
observations. For practical convenience, the estimate of P(A) can be written as if it were
actually P(A) and the relative frequency definition of probability may be expressed as
$$P(A) = \frac{m}{n}$$
The probability obtained by the above relative frequency definition is called an a posteriori
or empirical probability.
Illustration 3.2
If we flip a coin 100 times and observe heads 65 times, the relative frequency or
empirical probability of getting heads (A) is
$$P(A) = \frac{m}{n} = \frac{65}{100} = 0.65$$
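The idea that the relative frequency settles near the true probability as the number of trials grows can be explored with a small simulation. This is a sketch that assumes a fair coin (probability of heads 0.5) and uses a fixed seed so that the run is repeatable; the function name is illustrative.

```python
import random

def relative_frequency_of_heads(n, seed=0):
    """Simulate n tosses of a fair coin and return m / n, the share of heads."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    heads = sum(rng.random() < 0.5 for _ in range(n))
    return heads / n

for n in (10, 100, 1_000, 100_000):
    print(f"n = {n:>7}: relative frequency = {relative_frequency_of_heads(n):.4f}")
```

With few tosses the estimate can be far from 0.5 (as with the 0.65 observed in 100 tosses above), while for large n it is typically much closer.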
Example
A lawyer decides that the chance of winning his client's case in court would be 75%
according to the currently available evidence.
Illustration 3.3
Calculate the probability of observing at least one head in a toss of two coins.
Solution
According to the sample space, there are three outcomes with at least one head. They
are $E_1$, $E_2$ and $E_3$.
Hence,
$$P(A) = P(E_1) + P(E_2) + P(E_3) = \frac{1}{4} + \frac{1}{4} + \frac{1}{4} = \frac{3}{4}$$
The number $P(E_i)$ is called the probability of event $E_i$ of the sample space if
i. $0 \le P(E_i) \le 1$
ii. $\sum_{S} P(E_i) = 1$
Thus, we require that a probability be greater than or equal to 0 and less than or equal to 1
and that the sum of the probabilities over the entire sample space, S, be equal to 1.
Let 𝐴 and 𝐵 be two events in a sample space. The union of 𝐴 and 𝐵 is defined to be the event
containing all sample points in 𝐴 or 𝐵 or both. We denote the union of 𝐴 and 𝐵 by the symbol
𝐴 ∪ 𝐵. The intersection of 𝐴 and 𝐵 is the event composed of all sample points that are in both
𝐴 and 𝐵 and is denoted by 𝐴 ∩ 𝐵. The union and the intersection of 𝐴 and 𝐵 are shown
diagrammatically as shaded areas in the following figures.
[Figure: Venn diagrams with shaded areas showing 𝐴 ∪ 𝐵 and 𝐴 ∩ 𝐵]
If 𝐴 and 𝐵 are mutually exclusive, P(A ∩ B) = 0. That is, if 𝐴 and 𝐵 are mutually exclusive,
they do not share common elements.
Therefore, P(A ∪ B) = P(A) + P(B).
Illustration 3.4
In a group of 20 adults, 4 out of 7 women and 2 out of 13 men wear glasses. What is the
probability that a person chosen at random from the group is a woman or someone who
wears glasses?
Let 𝐴 be the event "the person chosen is a woman" and 𝐵 be the event "the person
chosen wears glasses".
Then, $P(A) = \frac{7}{20}$, $P(B) = \frac{6}{20}$ and $P(A \cap B) = \frac{4}{20}$.
$$P(A \cup B) = P(A) + P(B) - P(A \cap B) = \frac{7}{20} + \frac{6}{20} - \frac{4}{20} = \frac{9}{20}$$
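The addition rule can be cross-checked by listing the group explicitly and counting. The sketch below uses Python's `fractions.Fraction` to keep the probabilities exact; the way the group is represented (a list of sex and glasses pairs) is an illustrative choice.

```python
from fractions import Fraction

# Build the group of 20 adults: (sex, wears_glasses).
group = ([("woman", True)] * 4 + [("woman", False)] * 3 +
         [("man", True)] * 2 + [("man", False)] * 11)
n = len(group)

p_a = Fraction(sum(sex == "woman" for sex, _ in group), n)               # woman
p_b = Fraction(sum(glasses for _, glasses in group), n)                  # glasses
p_a_and_b = Fraction(sum(sex == "woman" and glasses
                         for sex, glasses in group), n)                  # both

# Addition rule versus direct counting of "woman or wears glasses".
p_union_rule = p_a + p_b - p_a_and_b
p_union_count = Fraction(sum(sex == "woman" or glasses
                             for sex, glasses in group), n)

print("P(A or B) by the addition rule:", p_union_rule)
print("P(A or B) by direct counting:  ", p_union_count)
```

Subtracting P(A ∩ B) prevents the four women who wear glasses from being counted twice.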
The complement of an event 𝐴 is the collection of all sample points in the sample space 𝑆 that are
not in 𝐴. The complement of 𝐴 is denoted by $A'$, $A^c$, or $\bar{A}$.
Since 𝐴 and 𝐴′ are mutually exclusive, so that P(A ∩ A′) = 0, and exhaustive (that is, there are no
elements outside 𝐴 ∪ 𝐴′), and P(𝑆) = 1, it follows that P(𝐴) + P(𝐴′) = 1.
Illustration 3.5
A card is drawn from an ordinary pack of 52 playing cards. Find the probability that the
card is not a seven.
Let 𝐴 be the event “the card is a seven”. Then,
$$P(A) = \frac{4}{52}$$
Therefore, the probability that the card is not a seven is
$$P(A') = 1 - P(A) = 1 - \frac{4}{52} = \frac{48}{52} = \frac{12}{13}$$
𝐴 and 𝐵 are two events. The probability of 𝐴, given that 𝐵 has already occurred, is written
P(𝐴|𝐵) and is computed by
$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
Similarly, the probability of 𝐵, given that 𝐴 has already occurred, can be computed
by
$$P(B|A) = \frac{P(A \cap B)}{P(A)}$$
Illustration 3.6
Given that a heart is picked at random from a pack of 52 playing cards, find the probability
that it is a picture card.
Let 𝐴 be the event "a heart is drawn", 𝐵 be the event "a picture card is drawn".
Then, $P(A) = \frac{13}{52}$ and $P(A \cap B) = \frac{3}{52}$, given that K, Q and J are the picture cards.
Thus,
$$P(B|A) = \frac{P(A \cap B)}{P(A)} = \frac{3/52}{13/52} = \frac{3}{13}$$
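The result can be confirmed by building the pack of cards and counting. In this sketch the pack is represented as (rank, suit) pairs, which is an illustrative choice; the picture cards are taken to be J, Q and K, as in the illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))          # 52 (rank, suit) pairs
picture_ranks = {"J", "Q", "K"}

hearts = [card for card in deck if card[1] == "hearts"]
picture_hearts = [card for card in hearts if card[0] in picture_ranks]

# Conditional probability from the definition P(B|A) = P(A and B) / P(A).
p_a = Fraction(len(hearts), len(deck))
p_a_and_b = Fraction(len(picture_hearts), len(deck))
p_b_given_a = p_a_and_b / p_a

# Equivalent view: restrict the sample space to the 13 hearts.
p_restricted = Fraction(len(picture_hearts), len(hearts))

print("P(picture card | heart) =", p_b_given_a)
```

Conditioning on A amounts to shrinking the sample space to the 13 hearts, which is why both calculations agree.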
Two events 𝐴 and 𝐵 are said to be independent, if and only if P(A|B) = P(A) or P(B|A) =
P(B). Otherwise, the events are said to be dependent.
3.8 The Multiplicative Law of Probability
Given two events 𝐴 and 𝐵, the probability of the intersection is P(A ∩ B) = P(A) · P(B|A). If 𝐴 and 𝐵 are independent, P(B|A) = P(B), so this reduces to P(A ∩ B) = P(A) · P(B).
Illustration 3.7
A die is thrown twice. Find the probability of obtaining a 4 on the first throw and an odd
number on the second throw.
Let 𝐴 be the event "a 4 is obtained on the first throw" and 𝐵 be the event "an odd number
is obtained on the second throw". Then, $P(A) = \frac{1}{6}$ and $P(B) = \frac{3}{6} = \frac{1}{2}$. The occurrence of
the event 𝐴 does not influence in any way the probability of the event 𝐵. Hence, the events
𝐴 and 𝐵 are independent.
$$P(A \cap B) = P(A) \cdot P(B) = \frac{1}{6} \times \frac{1}{2} = \frac{1}{12}$$
A tree diagram can be used to find the elements of the sample space. For example, if we toss
a coin, then the possible outcomes of the experiment are head and tail. This information can
be shown as follows.
Illustration 3.8
Three coins are tossed. What is the probability of obtaining exactly 2 heads? Find it using a
tree diagram.
$n(S) = 8$
$A = \{HHT, HTH, THH\}$, so $n(A) = 3$ and
$$P(A) = \frac{n(A)}{n(S)} = \frac{3}{8}$$
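The branches of the tree diagram correspond to all ordered sequences of H and T, which can be generated with `itertools.product`. The sketch below assumes fair coins, so that all eight outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product

# Every path through the three-level tree: HHH, HHT, ..., TTT.
sample_space = ["".join(path) for path in product("HT", repeat=3)]

# Event A: exactly two heads.
event_a = [outcome for outcome in sample_space if outcome.count("H") == 2]

p_a = Fraction(len(event_a), len(sample_space))

print("n(S) =", len(sample_space))
print("A =", event_a)
print("P(A) =", p_a)
```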
Illustration 3.9
A bag contains 5 red marbles, 3 blue marbles and 2 green marbles. One marble is drawn
at random from the bag and replaced, and then a second marble is drawn from the bag.
Find the probability of obtaining the following events using a tree diagram.
$n(S) = 9$
Since the first marble is replaced before the second is drawn, the events are independent.
Hence $P(R) = \frac{5}{10}$, $P(B) = \frac{3}{10}$ and $P(G) = \frac{2}{10}$ for any drawing.
i. $P(R, R) = \frac{5}{10} \times \frac{5}{10} = \frac{25}{100} = \frac{1}{4}$
ii. P(Marbles are of different colours) = 1 – P(Marbles are of the same colour)
$$= 1 - [P(R, R) + P(B, B) + P(G, G)]$$
$$= 1 - \left(\frac{5}{10} \times \frac{5}{10}\right) - \left(\frac{3}{10} \times \frac{3}{10}\right) - \left(\frac{2}{10} \times \frac{2}{10}\right) = \frac{62}{100} = \frac{31}{50}$$
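The nine branches of this tree and their probabilities can be tabulated in Python. The sketch relies on the independence of the two draws (sampling with replacement), so each branch probability is the product of the two single-draw probabilities.

```python
from fractions import Fraction
from itertools import product

# Single-draw probabilities: 5 red, 3 blue, 2 green out of 10 marbles.
p = {"R": Fraction(5, 10), "B": Fraction(3, 10), "G": Fraction(2, 10)}

# Joint probability of each ordered pair of colours (the 9 tree branches).
joint = {(first, second): p[first] * p[second]
         for first, second in product(p, repeat=2)}

p_red_red = joint[("R", "R")]
p_same_colour = sum(joint[(colour, colour)] for colour in p)
p_different_colours = 1 - p_same_colour

print("Number of branches:", len(joint))
print("P(R, R) =", p_red_red)
print("P(different colours) =", p_different_colours)
```

The branch probabilities sum to 1, which is a useful check that the tree is complete.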
Suppose $A_1, A_2, \dots, A_n$ are mutually exclusive and exhaustive events of the sample space 𝑆 (that is,
$A_1 \cup A_2 \cup \dots \cup A_n = S$) and 𝐵 is an arbitrary event of 𝑆. Then
$$P(A_i|B) = \frac{P(B|A_i)\,P(A_i)}{\sum_{j=1}^{n} P(B|A_j)\,P(A_j)}$$
This rule is called Bayes' rule for probability. If n = 2, Bayes' rule reduces to the
expression
$$P(A_i|B) = \frac{P(B|A_i)\,P(A_i)}{P(B|A_1)\,P(A_1) + P(B|A_2)\,P(A_2)}$$
where i = 1, 2.
Illustration 3.10
In recent years, much has been written about the possible link between cigarette smoking
and lung cancer. Suppose that in a large medical centre, of all the smokers who were
suspected of having lung cancer, 90% of them did, while only 5% of the non-smokers who
were also suspected of having lung cancer actually did. If the proportion (probability) of
smokers is 0.45, what is the probability that a lung cancer patient who is selected by
chance is a smoker?
Let 𝐴1 be the event "a patient is a smoker", 𝐴2 be the event "a patient is a non-smoker" and
𝐵 be the event "a patient has lung cancer".
P(𝐴1) = 0.45, P(𝐴2) = 0.55, P(𝐵|𝐴1) = 0.9, P(𝐵|𝐴2) = 0.05
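Substituting these values into Bayes' rule gives P(A1|B) = (0.9 × 0.45) / (0.9 × 0.45 + 0.05 × 0.55) = 0.405/0.4325 ≈ 0.936. The Python sketch below repeats that arithmetic; the function name `bayes_posterior` is a hypothetical helper introduced here, not something defined in the text.

```python
def bayes_posterior(priors, likelihoods, i):
    """Return P(A_i | B) from priors P(A_j) and likelihoods P(B | A_j)."""
    # Total probability of B: sum of P(B | A_j) * P(A_j) over all j.
    total = sum(l * p for l, p in zip(likelihoods, priors))
    return likelihoods[i] * priors[i] / total

priors = [0.45, 0.55]        # P(A1) smoker, P(A2) non-smoker
likelihoods = [0.90, 0.05]   # P(B | A1), P(B | A2)

p_smoker_given_cancer = bayes_posterior(priors, likelihoods, 0)
print(round(p_smoker_given_cancer, 4))  # 0.9364
```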
Illustration 3.11
Two companies (X and Y) produce RAM cards. An organization wishes to buy 150 RAM
cards and there is a 60% chance that the order will be placed with company Y. RAM cards
manufactured by companies X and Y have 4% and 6% defects respectively. Given that a
defective RAM card has been found, what is the probability that it has come from company
X?
Let 𝐴x be the event "selecting the RAM card from company X", 𝐴y be the event "selecting
the RAM card from company Y" and 𝐵 be the event "RAM card is defective".
P(𝐴x) = 0.4, P(𝐴y) = 0.6, P(𝐵|𝐴x) = 0.04, P(𝐵|𝐴y) = 0.06
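The required probability follows by applying Bayes' rule with n = 2 to the values above. The working below is a reconstruction from those stated values, offered as a check rather than as the text's own solution.

```latex
\[
P(A_x \mid B)
  = \frac{P(B \mid A_x)\,P(A_x)}{P(B \mid A_x)\,P(A_x) + P(B \mid A_y)\,P(A_y)}
  = \frac{0.04 \times 0.4}{0.04 \times 0.4 + 0.06 \times 0.6}
  = \frac{0.016}{0.052} \approx 0.3077
\]
```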
The numerical outcomes of experiments vary from one experiment to another and therefore
represent observations on a variable which we will denote by the symbol 𝑥. A variable X is a
random variable if the values (𝑥) that X assumes, corresponding to the various outcomes of
an experiment, are chance or random events. Random variables are classified as one of two
types: discrete or continuous. A discrete random variable is one that can assume a
countable number of values. Typical examples of discrete random variables are,
a) The number of defective bolts in a sample of ten drawn from industrial production.
b) The number of people in a waiting line for banking services.
A continuous random variable is one that can assume an infinitely large number of values
corresponding to the points on a line interval. Typical examples of continuous random
variables are,
a) The amount of gasoline produced per day in a refinery.
b) The waiting time before service at a supermarket counter.
The probability distribution for a discrete random variable is a formula, table or graph that
provides the probability associated with each value of the random variable. It is interesting
to note that the events cannot overlap because one and only one value of X is assigned to each
sample point; therefore, the values of X represent mutually exclusive numerical events. We
may therefore state two requirements for a probability distribution,
1. 0 ≤ p(x) ≤ 1 for all x
2. ∑ p(x) = 1, where the sum is taken over all x
where p(x) = P(X = x) is the probability that the random variable X takes the value x.
Illustration 3.12
Consider an experiment that consists of tossing two coins and let X be the number of
heads observed. Find the tabulated probability distribution for X.
The sample point 𝐸1 is associated with the sample event "observe a head on coin 1 and a
head on coin 2", then the value of X is 2, similarly, for the event 𝐸2, the value of X is 1, etc.
The sample space of X is, X = {0,1,2}
The values of X with respective probabilities are given in the following table.
X    p(x) = P(X = x)
0    1/4
1    2/4
2    1/4
Observe that the probability distribution of X in the above table can be represented
graphically as follows.
If you were to draw a sample from this population, that is, if you were to throw two balanced
coins, say, 100 times, record the number of heads observed each time, and then construct a
histogram of the 100 measurements, you would find that the histogram of your sample
appears very similar to that for p(x).
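The sampling experiment described above can be imitated on a computer. The following is a rough sketch only: it assumes fair coins, and the seed value is an arbitrary choice made here so that the run is repeatable.

```python
import random
from collections import Counter

random.seed(1)  # arbitrary fixed seed so the run is repeatable

n_trials = 100
# Each trial: toss two balanced coins and count the heads.
counts = Counter(
    sum(random.choice((0, 1)) for _ in range(2)) for _ in range(n_trials)
)

# Relative frequencies to compare with p(0) = 1/4, p(1) = 2/4, p(2) = 1/4.
relative_freq = {x: counts[x] / n_trials for x in (0, 1, 2)}
print(relative_freq)
```

With only 100 trials the relative frequencies will differ somewhat from 1/4, 2/4 and 1/4; increasing `n_trials` should bring them closer.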
Illustration 3.13
p(0) = P(X = 0) = 0
p(1) = P(X = 1) = c
p(2) = P(X = 2) = 4c
p(3) = P(X = 3) = 9c
p(4) = P(X = 4) = 16c.
We know that ∑ p(x) = 1, where the sum is taken over all x.
Thus, 0 + c + 4c + 9c + 16c = 1
30c = 1, c = 1/30
Therefore, the probability distribution function is p(x) = x²/30 for x = 0, 1, 2, 3, 4.
X    p(x) = P(X = x)
0    0
1    1/30
2    4/30
3    9/30
4    16/30
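The constant c and the two requirements of a probability distribution can be verified with exact fractions. This sketch assumes, as the working above implies, that p(x) is proportional to x² on x = 0, 1, 2, 3, 4.

```python
from fractions import Fraction

values = [0, 1, 2, 3, 4]

# p(x) = c * x**2 must sum to 1, so c = 1 / sum(x**2).
c = Fraction(1, sum(x**2 for x in values))
pmf = {x: c * x**2 for x in values}

# Check the two requirements of a probability distribution.
assert all(0 <= prob <= 1 for prob in pmf.values())
assert sum(pmf.values()) == 1

print(c, pmf[4])  # 1/30 8/15
```

Note that `Fraction` prints 16/30 in lowest terms, as 8/15.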
We need to take a different approach to find the probability distribution for a continuous
random variable. The relative frequency associated with a particular class in the population
is the fraction of measurements in the population falling in that interval; it is also the
probability of drawing a measurement in that class. If the total area under the relative
frequency histogram were adjusted to be equal to 1, then the area under the frequency curve
would correspond to probabilities. Assume that the random variable, X, may take values on
a real line. The density of probability (but not the value of the probability), which varies with
the values of X may be represented by a mathematical expression fX (x) or f(x) , called the
probability density function (pdf) for X. The probability density function f(x) is defined so that
the total area under the curve is equal to 1 and therefore, the area lying above a given interval
of values will be equal to the probability that X will fall in that interval. Therefore, the
probability that a < X < b is equal to the shaded area under the density function, f(x),
between the two points 𝑎 and 𝑏.
Since the area lying over any particular point, say X = a, is 0, it follows that P(X = a) = 0.
This means that P(X ≤ a) = P(X < a) and P(X ≥ a) = P(X > a). This is, of course, not true
for a discrete random variable.
Because of the symmetry, we can simplify our areas by means of the transformation,
$$Z = \frac{X - \mu}{\sigma}$$
Now, if the random variable X has a normal distribution with mean 𝜇 and standard deviation
𝜎, then the random variable Z has a standard normal distribution with mean 0 and standard
deviation 1. The area under the normal curve between the mean, Z = 0, and a specified value
of Z (> 0), say 𝑧0, is the probability P(0 ≤ Z ≤ z0). This area is recorded in Appendix A and
is shown as the shaded area in the following figure.
Illustration 3.14
This probability corresponds to the area between the mean (Z = 0) and a point Z = 1.63
standard deviations to the right of the mean. This area is shaded in the following figure.
Since the standard normal table in Appendix A gives the area under the normal curve to
the right of the mean, we need only find the tabulated value corresponding to z = 1.63.
Go down the left-hand column of the table to the row corresponding to z = 1.6 and across
the top of the table to the column marked 0.03. The intersection of this row and column
gives the area 0.4484. Therefore, P(0 ≤ Z ≤ 1.63) = 0.4484.
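A tabulated area of this kind can be reproduced without the table. The sketch below uses the standard identity Φ(z) = ½(1 + erf(z/√2)) for the standard normal distribution function; the helper name `area_from_mean` is introduced here for illustration.

```python
from math import erf, sqrt

def area_from_mean(z0):
    """Area under the standard normal curve between 0 and z0 (z0 >= 0)."""
    # Phi(z0) - 0.5, where Phi(z) = 0.5 * (1 + erf(z / sqrt(2))).
    return 0.5 * erf(z0 / sqrt(2))

area = area_from_mean(1.63)
print(round(area, 4))
```

The printed value should agree with the tabulated 0.4484 to four decimal places.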
Illustration 3.15
This probability corresponds to the area between z = −0.5 and z = 1.0 as shown in the
following figure.
The area required is equal to the sum of 𝐴1 and 𝐴2 as shown in the above figure. From
Appendix A, we read 𝐴2 = 0.3413. The area 𝐴1 equals the corresponding area
between Z = 0 and Z = 0.5, that is, 𝐴1 = 0.1915. Thus,
P(−0.5 ≤ Z ≤ 1.0) = 𝐴1 + 𝐴2 = 0.1915 + 0.3413 = 0.5328.
Illustration 3.16
Studies show that gasoline usage for compact cars sold is normally distributed with a
mean usage of 30.5 miles per gallon (mpg) and a standard deviation of 4.5 mpg. What
percentage of compacts obtain 35 or more miles per gallon?
First, we must find the 𝑧 value corresponding to x = 35 mpg . Substituting into the formula
for Z, we obtain,
Z = (X − μ)/σ = (35 − 30.5)/4.5 = 1. The required area is given by the shaded area in the following figure.
The area to the right of the mean corresponding to 𝑧 = 1.0 is 0.3413. The entire area to the
right of the mean is 0.5. Thus,
P(X > 35) = P(Z > 1.0) = 0.5 − 0.3413 = 0.1587.
Thus, the percentage exceeding 35 mpg is 15.87%.
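The same percentage can be obtained directly from the normal distribution function. This is a brief sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

mpg = NormalDist(mu=30.5, sigma=4.5)

# P(X > 35) = 1 - P(X <= 35)
p_over_35 = 1 - mpg.cdf(35)
print(f"{p_over_35:.4f}")  # 0.1587
```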
Illustration 3.17
In the previous example, if the manufacturer wants to develop a compact car which
outperforms 95% of the current compacts in fuel economy, what must be the gasoline
usage for the new car?
Let X be a normally distributed random variable with mean equal to 30.5 and standard
deviation equal to 4.5. We want to find the value x0 such that P(X < x0) = 0.95.
P(X < x0) = P(Z < (x0 − μ)/σ) = P(Z < (x0 − 30.5)/4.5) = 0.95
Let (x0 − 30.5)/4.5 = z0 such that P(Z < z0) = 0.95.
Therefore, z0 = 1.645, which covers an area of 0.45 to the right side of the mean in Table
A.
Then (x0 − 30.5)/4.5 = 1.645, so x0 = 37.9.
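Reading the table backwards corresponds to evaluating the inverse of the normal distribution function. The sketch below uses `NormalDist.inv_cdf` from the Python standard library (Python 3.8 or later) and then converts z0 back to the original scale with x0 = μ + z0·σ.

```python
from statistics import NormalDist

mpg = NormalDist(mu=30.5, sigma=4.5)

# z0 with P(Z < z0) = 0.95, then convert back with x0 = mu + z0 * sigma.
z0 = NormalDist().inv_cdf(0.95)
x0 = mpg.mean + z0 * mpg.stdev
print(round(z0, 3), round(x0, 1))  # 1.645 37.9
```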
The probability distribution provides a model for the theoretical frequency distribution of a
random variable and hence must possess a mean, variance, standard deviation, and other
descriptive measures associated with the theoretical population that it represents.
We shall confine our attention to the problem of calculating the mean value of a random
variable defined over a theoretical population. This mean (𝜇) is called the expected
value, E(X), of the random variable X. Let X be a discrete random variable with probability
distribution p(x) and let E(X) represent the expected value of X. Then,
$$E(X) = \sum_{\text{all } x} x \cdot p(x)$$
Illustration 3.18
Consider the random variable X representing the number of heads on the toss of two coins.
Find the expected value of X .
𝒙 p(x) x. p(x)
0 1/4 0
1 2/4 2/4
2 1/4 2/4
μ = E(X) = ∑ x · p(x) = 0 + 2/4 + 2/4 = 1, where the sum is taken over all x.
Hence, if the experiment were repeated a very large number of times, the average number of
heads per toss of the two coins would approach one.
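The expected value of any tabulated discrete distribution can be computed the same way. A minimal sketch, assuming the distribution is supplied as a dictionary mapping each x to p(x); the function name `expected_value` is introduced here for illustration.

```python
from fractions import Fraction

def expected_value(pmf):
    """E(X) = sum of x * p(x) over all x, for a pmf given as {x: p(x)}."""
    return sum(x * p for x, p in pmf.items())

# Number of heads on the toss of two coins.
two_coins = {0: Fraction(1, 4), 1: Fraction(2, 4), 2: Fraction(1, 4)}
mu = expected_value(two_coins)
print(mu)  # 1
```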
Illustration 3.19
Eight thousand tickets are to be sold at Rs. 20.00 each in a lottery conducted to benefit the
local school. The prize is a Rs. 52,000.00 valued mobile phone. If you purchase two tickets,
what is your expected gain?
Here, either you will lose Rs. 40.00 or will win Rs. 51,960.00 (= 52,000 - 40) with
probabilities 7998/8000 and 2/8000 respectively. The expected gain will be,
E(X) = (−40) × 7998/8000 + (51,960) × 2/8000 = −Rs. 27
3.16 Variance
Let X be a discrete random variable with probability distribution p(x); then the variance, σ²,
of X is,
$$\sigma^2 = E[(X - \mu)^2] = \sum_{\text{all } x} (x - \mu)^2\, p(x) = \sum_{\text{all } x} x^2\, p(x) - \mu^2$$
Illustration 3.20
Let X be a random variable with probability distribution given by the following table. Find
𝜇, 𝜎2, and 𝜎 of X.
𝑥 -1 0 1 2 3 4 5
p(x) 0.05 0.1 0.4 0.2 0.1 0.1 0.05
x · p(x) -0.05 0 0.4 0.4 0.3 0.4 0.25
x² · p(x) 0.05 0 0.4 0.8 0.9 1.6 1.25
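Summing the last two rows of the table gives ∑x·p(x) = 1.7 and ∑x²·p(x) = 5.0, so that σ² = 5.0 − 1.7² = 2.11 and σ ≈ 1.453. The Python sketch below repeats this calculation, using the shortcut σ² = E(X²) − μ²; it is offered as a check on the table, with variable names chosen here.

```python
from math import sqrt, isclose

xs = [-1, 0, 1, 2, 3, 4, 5]
ps = [0.05, 0.1, 0.4, 0.2, 0.1, 0.1, 0.05]
assert isclose(sum(ps), 1.0)  # the probabilities must sum to 1

mu = sum(x * p for x, p in zip(xs, ps))        # sum of x * p(x)
ex2 = sum(x**2 * p for x, p in zip(xs, ps))    # sum of x^2 * p(x)
variance = ex2 - mu**2                         # sigma^2 = E(X^2) - mu^2
sigma = sqrt(variance)

print(round(mu, 2), round(variance, 2), round(sigma, 3))
```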
The need for adequate and reliable data for taking policy decisions in different fields of
human activity is ever increasing. There are two ways in which the required information may
be obtained: census and sampling.
a) Census
Under the complete enumeration survey method, data are collected for each and every unit
(person, household, field, shop, factory etc.) of the population or universe, which is the
complete set of items (sampling frame) that are of interest in any particular situation. The
advantage of this type of survey is that no unit is left out and hence greater accuracy may
be ensured. However, the effort, money and time required for carrying out complete
enumeration will generally be extremely large, and in many cases the cost may be so prohibitive
that the very idea of collecting information may be dropped. Hence, in modern times very
little use is made of complete enumeration surveys.
Example
The government of Sri Lanka conducts a census once in ten years where every household is
surveyed.
b) Sampling
Very often our attitudes, our knowledge and our actions are based on samples. This applies
equally to everyday life and to scientific research. A person's opinion of a bank, a shop or
an institution is generally based on one or two encounters with it over the course
of several years. A visitor's opinion about a country after
spending a few days in it will be determined by his experiences of the few places he has seen
and the few persons he has met. Perhaps our visitor is less likely to be aware of the extent of
his ignorance. Generally, a vaguely formulated understanding of sampling is part of what is
called common sense and is characteristic of the everyday approach.
Examples
i. A trader examining a handful of grains from the bag.
All of them are employing the method of sampling. Their confidence in their judgment rests
on the fact that the material they are sampling is so well mixed or homogeneous that the few
grains of wheat, a drop of blood, a few leaves of tea or a few grains of rice do adequately
represent the whole.
To test the quality of light bulbs, manufacturers test the lifetime of bulbs by keeping them lit
until they burn out. To ensure that the production meets the minimum standard, a
relatively small sample of bulbs is selected.
The data can be collected and summarized more quickly with a sample than with a complete
count. This is a vital consideration when the information is urgently needed.
• Greater Scope
In certain types of inquiry, highly trained personnel or specialized equipment, limited in
availability, must be used to obtain the data. A complete census is then impracticable.
The choice lies between obtaining the information by sampling or not at all. Thus, surveys
that rely on sampling have more scope and flexibility regarding the types of information
that can be obtained.
• Greater Accuracy
In sampling, high-quality personnel can be employed and intensive training can be given to
them. Careful supervision of the fieldwork can also be done. Accurate processing of results
is also possible when the volume of work is reduced. Thus, a sample may produce more
accurate results than a census.
The error arising from drawing inferences about the population on the basis of a few
observations (sampling) is termed sampling error. Clearly, sampling error in this sense
is non-existent in a complete enumeration survey since the whole population is surveyed.
However, the errors arising mainly at the stages of ascertainment and processing of data,
which are termed non-sampling errors, are common to both complete enumeration and
sample surveys.
These sources are not exhaustive, but are given to indicate some of the possible sources of
error. In a sample survey, non-sampling errors may also arise due to a defective frame and
faulty selection of sampling units.
There are many methods of sampling. The choice of method will be determined by the
purpose of sampling. The various methods can be categorized under two groups.
• Relative Importance
• Relative Variability
• Unit sampling cost
• Relative size
When the first three factors are equal across all strata, the method of proportional allocation
is used to select a sample.
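Under proportional allocation, the sample drawn from each stratum is proportional to the size of that stratum, that is, n_h = n × N_h / N. The following sketch illustrates the idea with hypothetical strata that do not come from the text; the function name `proportional_allocation` is likewise an assumption made here.

```python
def proportional_allocation(stratum_sizes, sample_size):
    """Allocate sample_size across strata in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    # n_h = n * N_h / N, rounded to the nearest whole unit.
    return {name: round(sample_size * size / total)
            for name, size in stratum_sizes.items()}

# Hypothetical strata (not from the text): 500, 300 and 200 units.
strata = {"A": 500, "B": 300, "C": 200}
allocation = proportional_allocation(strata, 100)
print(allocation)  # {'A': 50, 'B': 30, 'C': 20}
```

Because of rounding, the allocated sizes may not add up exactly to the intended sample size when the proportions are not whole numbers, so a small manual adjustment may be needed.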
c) Systematic Sampling
Systematic sampling is also known as quasi-random sampling. This method is used when a
complete list of the population is available. We arrange the items in numerical, alphabetical,
geographical or any other order. Under this method, every kth item is picked from the sampling
frame, where 𝑘 is the sampling interval. Suppose we want to select a sample of 10 students
from 120 students.
In this example, 𝑘 = 120/10 = 12, so 12 is the sampling interval. The first student is
selected at random from the first 12 names and every 12th student thereafter is taken into the
sample; i.e., if the first student is number 2, then the (2 + 12)th, (14 + 12)th, (26 + 12)th
students, and so on, are selected.
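The selection procedure can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes the population size is an exact multiple of the sample size, as in the 120-student example, and the function name `systematic_sample` is introduced here.

```python
import random

def systematic_sample(population, sample_size, rng=random):
    """Pick every k-th item after a random start within the first k items."""
    k = len(population) // sample_size      # sampling interval
    start = rng.randrange(k)                # random start, 0-based index in [0, k)
    return [population[start + i * k] for i in range(sample_size)]

students = list(range(1, 121))              # students numbered 1..120
sample = systematic_sample(students, 10)
print(sample)
```

Whatever the random start, consecutive selected numbers differ by the sampling interval of 12.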
Since it seems uneconomical to measure all the sampling units in the selected clusters,
one can, instead of enumerating all the units, obtain a sample of units by resorting to
sub-sampling within any chosen cluster. This technique is called two-stage sampling, with
clusters termed primary units and the units within clusters termed secondary units. Multi-stage
sampling is the generalization of this technique.
Example
It is decided to take a sample of 5,000 households from the Western province.
First stage:
The province may be divided into a number of districts and a few districts selected at
random.
Second stage:
Each district may be sub-divided into Grama Niladhari (GN) divisions and a sample of GN
divisions may be taken at random.
Third stage:
A number of households may be selected from each of the GN divisions selected at the second
stage. In this way, at each stage the sample size becomes smaller and smaller.
a) Judgment Sampling
In judgment sampling the choice of sample items depends exclusively on the discretion of
the investigator. In other words, the investigator exercises his judgment in the choice and
includes those items in the sample, which he thinks are most typical of the population with
regard to the characteristics under investigation.
b) Convenience Sampling
The method of convenience sampling is also called the chunk. A chunk refers to that fraction
of the population being investigated which is selected neither by probability nor judgment
but by convenience. A sample obtained from readily available lists such as automobile
registration, telephone directories, etc. is a convenience sample and not a random sample
even if the sample is drawn at random from the lists.
c) Quota Sampling
Quota sampling is a non-probability sampling technique wherein the assembled sample has
the same proportions of individuals as the entire population with respect to known
characteristics, traits or focused phenomenon. The researcher must make sure that the
composition of the final sample to be used in the study meets the research's quota criteria.
The main reason why researchers choose quota samples is that it allows the researchers to
sample a subgroup that is of great interest to the study.
Chapter 3 - Reinforcement Problems
1) Consider the following experiment involving two urns. Urn A contains 2 white balls and 1
black ball. Urn B contains 1 white ball. A ball is drawn from urn A and placed in urn B.
Then a ball is drawn from urn B. What is the probability that the ball drawn from urn B
will be white?
2) A coin is tossed, and a die is thrown. What is the probability of obtaining a head on the
coin and an even number on the die?
3) The two events 𝐴 and 𝐵 are such that P(𝐴) = 0.6, P(𝐵) = 0.2, P(𝐴|𝐵) = 0.1. Calculate the
probabilities that,
a) Both of the events occur.
b) At least one of the events occurs.
c) 𝐵 occurs, given that 𝐴 has occurred.
4) Two coins are tossed. Using a tree diagram find the probability of obtaining:
a) 2 tails
b) Exactly one head
c) At least one head
5) Two coins are tossed. One coin is unbiased, and the other is biased so that a head is twice
as likely as a tail. Find the probability of obtaining,
a) Two heads
b) A head and a tail
c) Two tails, using a tree diagram.
6) A box contains 5 white, 4 blue, and 5 red marbles. One marble is drawn at random and
returned to the box; a second marble is then drawn. Using a tree diagram find the
probability of drawing a blue and a red marble in that order.
7) A has a probability of ¾ of winning a set against B. A match is won by the player who first
wins two sets. Use a tree diagram to find the probability that A wins the match.
8) As items come to the end of a production line, an inspector chooses which items are to go
through a complete inspection. 10% of all items produced are defective. 60% of all
defective items go through a complete inspection, and 20% of all good items go through
a complete inspection. Given that an item is completely inspected, what is the probability
it is defective?
9) The random variable 𝑋 is normally distributed with mean 300 and standard deviation 5.
Find,
a) P(X > 294) b) P(X < 302) c) P(X < 312)
10) The random variable 𝑋 is normally distributed with mean 45 and standard
deviation 4. Find the value of 𝑥0 if,
a) P(X < 𝑥0) = 0.0317 b) P(X < 𝑥0) = 0.8950
11) Packages from a packing machine have a mass which is normally distributed
with mean 200 g and standard deviation 2 g. Find the probability that a package from the
machine weighs,
a) Less than 197 g b) More than 200.5 g c) Between 119.5 g and 198.5 g
12) The monthly demand for product A is normally distributed with mean 200 units
and standard deviation 40 units. The demand for another product, B, is also normally
distributed with mean 500 units and standard deviation 80 units. If a seller of these
products stocks 280 units of A and 650 units of B at the beginning of a month, what is the
probability that the seller will experience a stock out for both products during the month?
13) If the sampling frame available from the registrar’s files is a listing of the names
of all 4000 registered full-time students compiled from eight separate alphabetical lists,
based on the gender and academic year breakdowns, what type of sample should you
take? Discuss.
14) Suppose that each of the 5000 registered full-time students lived in one of the
10 university hostels. Each hostel accommodates 400 students. It is university policy to
fully integrate students by gender and academic year in each hostel. If the registrar is able
to compile a listing of all students by hostel, explain how you could take a cluster sample.
Answer: (c)
2) The probability that the Kelani River will flood in any given year has been estimated from
200 years of historical data to be one in four. This means:
(a) The Kelani River will flood every four years.
(b) In the next 100 years, the Kelani River will flood exactly 25 times.
(c) In the last 100 years, the Kelani River flooded exactly 25 times.
(d) In the next 100 years, the Kelani River will flood about 25 times.
Answer: (d)
3) The joint probability of events 𝐴 and 𝐵 is 32 percent with the probability of event 𝐴 being
60 percent and the probability of event 𝐵 being 50 percent. Based on this information,
the conditional probability of event A given event B has occurred is closest to:
a) 0.30 b) 0.64 c) 0.53 d) 0.46
Answer: (b)
4) A surprise quiz contains three multiple choice questions; question 1 has three suggested
answers, question 2 has four, and question 3 has two. A completely unprepared student
decides to choose the answers at random. If 𝑋 is the number of questions the student
answers correctly, find the probability of at least one correct answer.
a) 0.75 b) 0.87 c) 0.65 d) 0.43
Answer: (a)
5) A physical fitness association is including the mile run in its secondary-school fitness test.
The time for this event for boys in secondary school is known to possess a normal
distribution with a mean of 450 seconds and a standard deviation of 50 seconds. The
fitness association wants to recognize the fastest 10% of the boys with certificates of
recognition. What time would the boys need to beat in order to earn a certificate of
recognition from the fitness association? (Correct to 2 decimal places)
a) 532.25 sec b) 367.75 sec c) 385.92 sec d) 514 sec
Answer: (c)
7) Consider the random variable X representing the number of heads on the toss of two
coins. Find the expected value of X.
𝒙 p(x)
0 0.05
1 0.25
2 0.50
3 0.20
Answer: (d)
8) The variance of X is
a) 0.6725 b) 0.7921 c) 0.6275 d) 0.7291
Answer: (c)
Chapter 3 – Summary
The set of all possible outcomes of a random experiment is called the sample space. The
elements of sample space are called sample points.
An event is any collection of outcomes of an experiment; that is, an event is any subset of the
sample space.
The intersection of two events 𝐴 and 𝐵 results in another event containing all such elements
that are common to 𝐴 and 𝐵.
The union of two events 𝐴 and 𝐵 results in another event containing all elements, which
belong to either 𝐴 or 𝐵.
The complement of an event 𝐴, denoted by A′, is the event containing all the elements of 𝑆 that
are not elements of 𝐴.
Events are said to be mutually exclusive or disjoint if two or more of them cannot occur
simultaneously in a single trial of an experiment.
Let 𝐴 and 𝐵 be two events. The probability of 𝑨, given that 𝑩 has already occurred, is written
P(A|B), and P(A|B) = P(A ∩ B)/P(B).
The Additive Law of Probability for two events 𝐴 and 𝐵 is P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
The Multiplicative Law of Probability for two events A and B gives the probability of the
intersection as P(A ∩ B) = P(A|B) · P(B) = P(B|A) · P(A).
If events 𝐴 and 𝐵 are independent, P(A ∩ B) = P(A). P(B)
A variable X is a random variable if the values that X assumes, corresponding to the various
outcomes of an experiment, are chance or random events.
A discrete random variable is one that can assume a countable number of values.
A continuous random variable is one that can assume an infinitely large number of
values corresponding to the points on a line interval.
The probability distribution for a discrete random variable is a formula, table or graph that
provides the probability associated with each value of the random variable.
A census is a complete enumeration survey method in which data are collected for each and every
unit (person, household, field, shop, factory etc.) of the population.
In most of the cases, sampling is done rather than studying the entire population.
The error arising due to drawing inferences about the population on the basis of few
observations (sampling) is termed sampling error.
The errors arising mainly at the stages of ascertainment and processing of data, which are
termed non-sampling errors, are common to both complete enumeration and sample
surveys.
The various sampling methods can be categorized under two groups: probability sampling
and non-probability sampling.
In probability sampling, a sample is selected in such a way that each item or person in the
population being studied has a known probability of being included in the sample. Simple
Random Sampling, Stratified Random Sampling, Systematic Sampling and Cluster Sampling
are the probability sampling methods discussed in this chapter.
Learning Outcomes
4.0 Introduction
Managerial decisions often rely on the analysis of the relationship between multiple
variables. For instance, the marketing manager may examine the relationship between
advertising expenditure and sales in order to forecast sales for a specific advertising budget.
Similarly, a public utility may utilise the relationship between the daily temperature and
electricity demand to predict future electricity usage from the anticipated temperature for the
upcoming month. Sometimes, managers will rely on intuition to judge how two variables are
related.
An independent variable is a variable that remains unaffected by changes in other variables. In
an experimental study, the independent variable is the variable that researchers manipulate,
control, or vary to investigate its effects. It is referred to as “independent” because it is not
influenced by any other variables within the study. Independent variables are also known as
explanatory variables, predictor variables, input variables, regressors or exogenous variables.
A dependent variable is the variable that changes in response to the manipulation of the
independent variable. It represents the outcome of interest that you wish to measure, and
its changes are influenced by the independent variable.
Dependent variables are also known as response variables, outcome variables, target variables
and endogenous variables. To illustrate, let’s consider the variables of advertising
expenditure and sales revenue for a specific company. In this case, the sales revenue is
influenced by the amount spent on advertising. In other words, higher advertising
expenditure tends to lead to higher sales revenue. Hence, advertising expenditure is the
independent variable, while sales revenue is the dependent variable.
In a scatter plot, if the points are closely clustered around a straight line, it suggests a strong
linear relationship between two variables.
The data for the output at a factory each week for the last ten weeks and the
corresponding production cost of the output are given below.
• The production cost depends on the volume of output. Therefore, the volume of output
is the independent variable plotted on the x-axis, while the production cost is the
dependent variable plotted on the y-axis.
• Upon observing the graph, it becomes apparent that the scattered data points roughly
align along an upward-trending line. This suggests that as the output volume increases,
the production cost also tends to rise.
Note: A scatter diagram can reveal a range of different types and degrees of correlation.
4.2 Correlation
Correlation is a bivariate analysis that quantifies the strength of linear relationship between
two variables and the direction of the relationship. It assesses how changes in one variable
are associated with changes in another variable. In this chapter, we discuss two types of
correlation.
The distinction between linear and non-linear relationships depends on the nature of the
relationship between two variables. Linear relationship refers to a situation where the
relationship between two variables can be adequately represented by a straight line. In this
case, as one variable increases, the other variable changes at a constant rate, resulting in a
constant ratio of change. Linear relationship is captured by the Pearson’s correlation
coefficient.
Non-linear relationship, also known as curvilinear relationship, describes a relationship
between two variables that cannot be represented by a straight line. In such cases, the rate
of change of the variables is not constant, leading to varying ratios of change. The
relationship may follow a curved pattern, such as quadratic, exponential, logarithmic, or
other non-linear functions.
The line of best fit can be determined through regression analysis, a topic that will be
discussed in detail later in this chapter.
The following formula is used to calculate Pearson’s correlation coefficient (r).
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left(n\sum x^2 - (\sum x)^2\right)\left(n\sum y^2 - (\sum y)^2\right)}}$$
Where,
r = Pearson’s correlation coefficient
n = Number of paired observations
∑x = Sum of the x values
∑y = Sum of the y values
∑x² = Sum of the squared x values
∑y² = Sum of the squared y values
∑xy = Sum of the products of x and y
The correlation coefficient, 𝑟, varies between +1 and -1. A value of +1 indicates perfect
positive linear relationship between two variables, while a value of -1 indicates perfect
negative linear relationship. As the value 𝑟 approaches 0, the strength of the linear
relationship between the variables weakens.
The sign of 𝑟 indicates the direction of the relationship. A positive sign (+) suggests a
positive relationship, meaning that as one variable increases, the other variable tends to
increase as well. Conversely, a negative sign (-) indicates a negative relationship, where an
increase in one variable is associated with a decrease in the other variable.
Types of correlation
[Figure: scatter diagrams showing perfect positive correlation, strong positive correlation, moderate positive correlation and no correlation]
Evaluate Pearson’s correlation coefficient for the data on sales and advertising spend
in the table below.
Solution:
With calculations involving summations, it is convenient to set the calculations out
in columns.
         x      y        x²       y²            xy
         1.3    151.6    1.69     22,982.56     197.08
         0.9    100.1    0.81     10,020.01     90.09
         1.8    199.3    3.24     39,720.49     358.74
         2.1    221.2    4.41     48,929.44     464.52
         1.5    170.0    2.25     28,900.00     255.00
Total    7.6    842.2    12.40    150,552.50    1,365.43
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left(n\sum x^2 - (\sum x)^2\right)\left(n\sum y^2 - (\sum y)^2\right)}}$$
$$r = \frac{(5 \times 1{,}365.43) - (7.6 \times 842.2)}{\sqrt{\left((5 \times 12.4) - 7.6^2\right)\left((5 \times 150{,}552.5) - 842.2^2\right)}}$$
r = 0.993
Interpretation:
The Pearson’s correlation coefficient of 0.993 indicates a very strong positive linear
relationship between advertising expenditure and sales.
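The column totals and the value of r can be checked by computing the sums directly from the paired observations. The following is a small Python sketch of the same sums formula; the function name `pearson_r` is an illustrative choice, and the x and y lists are copied from the first two columns of the table.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from paired observations, using the sums formula."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx**2) * (n * syy - sy**2))

x = [1.3, 0.9, 1.8, 2.1, 1.5]
y = [151.6, 100.1, 199.3, 221.2, 170.0]
r = pearson_r(x, y)
print(round(r, 3))  # 0.993
```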
x 39 43 21 64 57 47 28 75 34 52
y 65 78 52 82 92 89 73 98 56 75
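         x      y      x²        y²        xy
         39     65     1,521     4,225     2,535
         43     78     1,849     6,084     3,354
         21     52     441       2,704     1,092
         64     82     4,096     6,724     5,248
         57     92     3,249     8,464     5,244
         47     89     2,209     7,921     4,183
         28     73     784       5,329     2,044
         75     98     5,625     9,604     7,350
         34     56     1,156     3,136     1,904
         52     75     2,704     5,625     3,900
Total    460    760    23,634    59,816    36,854
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}$$
$$r = \frac{(10 \times 36{,}854) - (460 \times 760)}{\sqrt{\left[(10 \times 23{,}634) - 460^2\right]\left[(10 \times 59{,}816) - 760^2\right]}} = \frac{18{,}940}{\sqrt{24{,}740 \times 20{,}560}}$$
r = 0.84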
The correlation value 0.84 indicates that the linear relationship between 𝑥 and 𝑦 is strong
and positive.
A company wants to identify the relationship between the number of cars sold of a new
model and the amount of money spent on advertising. The information on the cost
incurred on advertising (x) (in Rs. million) and the number of cars sold (y) for 15
years is summarised below:
∑x = 177, ∑y = 679, ∑x² = 2,576, ∑y² = 39,771, ∑xy = 9,915, n = 15
Based on the above data, the correlation coefficient between x and y is:
1) 0.91
2) -0.91
3) 0.19
4) -0.19
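When only the summary totals are available, as here, they can be substituted straight into the formula. The sketch below does this for the totals given in the question; the function name `pearson_r_from_sums` and its parameter names are illustrative assumptions.

```python
from math import sqrt

def pearson_r_from_sums(n, sx, sy, sxx, syy, sxy):
    """Pearson's r computed directly from the summary totals."""
    numerator = n * sxy - sx * sy
    denominator = sqrt((n * sxx - sx**2) * (n * syy - sy**2))
    return numerator / denominator

r = pearson_r_from_sums(n=15, sx=177, sy=679, sxx=2576, syy=39771, sxy=9915)
print(round(r, 2))  # 0.91
```

The result is positive and close to 0.91, which corresponds to option 1.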
The following table shows the advertising expenses and sales value for last 6 months
of a company:
Advertising Expenses
(Rs’000) (x) 44 29 74 12 9 50
Sales Value (Rs.’000)
(y) 550 480 630 230 240 610
Spearman’s rank correlation coefficient (Spearman’s rho) is the most common alternative to
Pearson’s correlation coefficient. It is used when the variables are measured on an ordinal
or ranked scale. It is a rank correlation coefficient because it uses the rankings of data from
each variable (e.g., from lowest to highest) rather than the raw data itself.
Spearman’s rank correlation coefficient is particularly useful when the variables do not have
a linear relationship or when the data is not normally distributed. It is also robust to
outliers. While the Pearson correlation coefficient measures the strength of the linear
relationship, the Spearman correlation coefficient measures the strength of the monotonic
relationship between the variables.
$$R = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$
Where,
n = number of pairs of data
d = the difference between the ranks in each set of data
Required
Determine whether the placement of the student in Test 01 correlates with their
placement in Test 02.
Solution
Correlation must be measured by Spearman’s rank coefficient because we are given
the placement of students rather than their actual marks.
$$R = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$
Where, d is the difference between the rank in Test 01 and the rank in Test 02 for
each student.
Student   Rank in Test 01   Rank in Test 02   d     d²
A         2                 1                  1    1
B         1                 3                 -2    4
C         4                 7                 -3    9
D         6                 5                  1    1
E         5                 6                 -1    1
F         3                 2                  1    1
G         7                 4                  3    9
Total                                               26
$$R = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6 \times 26}{7(49 - 1)} = 1 - \frac{156}{336} = 0.536$$
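The same calculation can be sketched in a few lines of code. The function name `spearman_from_ranks` is illustrative; it assumes that both lists already contain ranks (no raw marks) and that there are no tied ranks.

```python
def spearman_from_ranks(ranks_a, ranks_b):
    """Spearman's R = 1 - 6*sum(d^2) / (n*(n^2 - 1)) for untied ranks."""
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

# Ranks of students A to G, taken from the solution table
test_01 = [2, 1, 4, 6, 5, 3, 7]
test_02 = [1, 3, 7, 5, 6, 2, 4]

rho = spearman_from_ranks(test_01, test_02)
print(round(rho, 3))
```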
If some of the items in a question tie for a particular ranking, each tied item must be given
the average of the places they jointly occupy before the coefficient of rank correlation is
calculated.
Here is an example.
In a reality music show, five artists were placed in order of merit by two different
judges as follows.
Required
Assess how the two sets of rankings are correlated.
Solution
Artist Rank – Adam Rank – Smith d d2
$$R = 1 - \frac{6 \times 28.5}{5(25 - 1)} = -0.425$$
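The rule for tied ranks (each tied item receives the average of the places it shares) can be sketched as follows. The scores used here are hypothetical and are not the judges' data from the example; the function name `average_ranks` is also illustrative.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1), giving tied values their average place."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first_place = ordered.index(v) + 1   # first place occupied by this value
        count = ordered.count(v)             # number of items sharing the value
        rank_of[v] = first_place + (count - 1) / 2
    return [rank_of[v] for v in values]

# Hypothetical scores: the two 20s share places 2 and 3, so each gets 2.5
scores = [10, 20, 20, 30]
ranks = average_ranks(scores)
print(ranks)
```

Once both variables have been ranked in this way, the differences d and the sum of d² are calculated exactly as in the untied case.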
• Correlation does not imply causation. Even if two variables have strong correlation, it does
not necessarily mean that one variable directly causes the other to change.
Wine A B C D E F G H
Taster X 3 7 1 8 5 2 4 6
Taster Y 3 8 2 7 4 1 5 6
Find Spearman’s rank correlation coefficient for this data, giving your answer to
three decimal places.
Determine the ranks of the following data, with the smallest being ranked 1.
Values 3.21 3.49 3.99 4.05 3.49 4.49 4.99 4.05 3.49
Regression is a statistical method used in various fields, including finance and investing, to
analyse and understand the relationship between a dependent variable (Y) and a set of
independent variables (X’s). The primary goal of regression analysis is to determine the
strength, direction, and nature of the relationship between the variables involved. For
example, a manufacturer aims to predict the sales revenue of refrigerators (Y) by
analysing the key factors such as advertising expenditure (X1), average customer rating
(X2), price of the product (X3), and competitors’ advertising expenditure (X4). When a
single independent variable, X, is used to predict the dependent variable, it is referred to as
simple linear regression. This section discusses simple linear regression.
Here are a few examples where simple linear regression is used to establish a linear
relationship between an independent variable and a dependent variable.
1. Predicting the sales for a building materials store based on the size of the store.
2. Predicting the monthly rent of an apartment based on its size.
3. Predicting the monthly sales of a product in a supermarket based on the amount of shelf
space devoted to the product.
The equation of a straight line is y = β0 + β1x where β0 is the y intercept and β1, the slope of
the line. The linear model y = β0 + β1x is said to be a deterministic mathematical model
because when a value of x is substituted into the equation, the value of y is determined, and
no allowance is made for error. The simple linear regression model is given by,
y = β0 + β1x + ε
Where ε is assumed to be a random error variable with expected value equal to zero and
variance equal to σ². Furthermore, we assume that the distribution of errors about the line
is the same regardless of the value of x, and that the error terms are independent of one
another.
In general, we will assume, that we are given 𝑛 pairs of observations regarding two variables,
independent and dependent, (𝑥𝑖 , 𝑦𝑖 ), 𝑖 = 1,2, … , 𝑛. Data values 𝑥𝑖 are called observations on
the independent variable and the data values 𝑦𝑖 are called observations on the dependent
variable.
The simple linear regression model for the data values (𝑥𝑖 , 𝑦𝑖 ), 𝑖 = 1,2, … , 𝑛, is,
𝑦𝑖 = 𝛽0 + 𝛽1 𝑥𝑖 + 𝜀𝑖 , 𝑖 = 1,2, … , 𝑛.
For notational convenience, suppressing 𝑖, we write the above model as, y= β0 + β1x + ε.
The statistical procedure for finding the "best fitting" straight line for a set of points would
seem a formalisation of the procedure used to fit a line by the eye. If we denote the predicted
value of y obtained from the fitted line as 𝑦̂, the prediction equation will be
𝑦̂ = 𝑎 + 𝑏𝑥
The most common approach to finding 𝑎 and 𝑏 is using the least-squares method. This
method minimises the sum of the squared differences between the actual values, y, and the
predicted values 𝑦̂. This sum of squared differences is equal to ∑(𝑦 − 𝑦̂)2 = ∑(𝑦 − 𝑎 − 𝑏𝑥)2.
The least-squares method determines the values of a and b that minimise Σ(y − a − bx)²,
the sum of squared deviations around the prediction line. The estimates a and b can be
obtained using the following formulas.
$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2} \quad \text{and} \quad a = \bar{y} - b\bar{x}$$
Measures of Variation
When using the least-squares method to determine the regression coefficients for a set of
data, you need to compute three measures of variation. The first measure, the total sum of
squares (SST), is a measure of variation of the values, y around their mean 𝑦̅. The total
variation, or total sum of squares, (SST) is subdivided into explained variation and
unexplained variation.
The explained variation, or regression sum of squares (SSR), represents variation that is
explained by the relationship between x and y, and the unexplained variation, or error sum of
squares (SSE), represents variation due to factors other than the relationship between x and
y.
The total sum of squares is equal to the regression sum of squares plus the error sum of
squares as follows:
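$$SST = SSR + SSE$$
Where,
$$SST = \sum (y - \bar{y})^2, \quad SSR = \sum (\hat{y} - \bar{y})^2, \quad SSE = \sum (y - \hat{y})^2$$
The coefficient of determination, R² = SSR / SST, measures the proportion of the variation
in y that is explained by the regression model. R² varies between 0 and 1. The higher the
value of R², the higher the explanatory power of the regression model.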
For example, an 𝑅 2 value of 0.8 means that 80% of the variation in the dependent variable is
explained by the independent variable in the model, while the remaining 20% is attributed
to other factors or random error.
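The three sums of squares and R² can be illustrated numerically. The sketch below assumes the usual definitions SST = Σ(y − ȳ)², SSR = Σ(ŷ − ȳ)² and SSE = Σ(y − ŷ)², with R² = SSR / SST, and reuses the advertising and sales figures from the earlier Pearson illustration purely as sample data. The helper name `fit_line` is illustrative.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for the line y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b

# Sample data (advertising spend and sales from the earlier illustration)
xs = [1.3, 0.9, 1.8, 2.1, 1.5]
ys = [151.6, 100.1, 199.3, 221.2, 170.0]

a, b = fit_line(xs, ys)
y_bar = sum(ys) / len(ys)
y_hat = [a + b * x for x in xs]

sst = sum((y - y_bar) ** 2 for y in ys)               # total variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)          # explained variation
sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))  # unexplained variation
r_squared = ssr / sst

print(round(sst, 2), round(ssr, 2), round(sse, 2), round(r_squared, 3))
```

For simple linear regression, R² equals the square of Pearson's correlation coefficient, which gives a convenient cross-check.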
A company has the following data on its sales during the last year in each of its
regions and the corresponding number of salespersons employed during this time.
Obtain a simple linear regression model to forecast sales for each region.
The dependent variable, y, is the sales and the independent variable, x, is the number
of salespersons.
Thus,
$$b = \frac{(6 \times 19{,}736) - (79 \times 1{,}466)}{(6 \times 1{,}083) - 79^2} = \frac{2{,}602}{257} = 10.12$$
$$\bar{x} = \frac{79}{6} = 13.17, \qquad \bar{y} = \frac{1{,}466}{6} = 244.33$$
Hence,
a = 244.33 – (10.12 x 13.17) = 111.05
Interpretation of a and b
The b value of 10.12 tells us that each extra salesperson generates an extra 10.12
sales (on average), while the a-value of 111.05 means that 111.05 units will be sold
if no sales people are used.
Based on the fitted regression model, forecast the average number of sales that
would be expected next year in a region that employed
(a) 14 salespersons
(b) 25 salespersons
Solution
(a) The regression line is y = 111.05 + 10.12x. The value of y when 𝑥 = 14 is,
𝑦 = 111.05 + 10.12 × 14 = 252.73
Rounding this into whole number, it is expected that, on average, 253 units will
be sold in a region employing 14 salespersons.
(b) Similarly, when 𝑥 = 25, we have,
𝑦 = 111.05 + 10.12 × 25 = 364.05
When there are 25 salespersons, it is expected that, on average, there will be
sales of 364 units in a region.
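The fit and the two forecasts can be reproduced from the summary totals used in the working above (n = 6, Σx = 79, Σy = 1,466, Σxy = 19,736, Σx² = 1,083). This is a minimal sketch; the function name `forecast` is illustrative. Because the script does not round b before calculating a, its intercept (about 111.03) differs slightly from the hand-calculated 111.05, but the rounded forecasts agree.

```python
# Summary totals taken from the worked solution
n = 6
sum_x, sum_y = 79, 1466
sum_xy, sum_x2 = 19736, 1083

b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
a = sum_y / n - b * sum_x / n                                 # intercept

def forecast(salespersons):
    """Expected sales for a region with the given number of salespersons."""
    return a + b * salespersons

print(round(b, 2), round(a, 2))
print(round(forecast(14)), round(forecast(25)))
```

Note that 25 salespersons may lie outside the range of the observed data, in which case the forecast is an extrapolation and should be treated with more caution than the forecast for 14.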
12.1 0.61
14.1 0.63
14.6 0.70
15.1 0.70
15.2 0.75
Fit a simple linear regression model for the above data. Based on the fitted model,
forecast the profit for next year if an advertising budget of Rs. 800,000 is allocated.
Test yourself – 4.8
The following data describes fuel consumption and flying hours for single-engine,
piston-driven general aviation aircraft from 2000 through 2006. Fuel consumption is
in millions of gallons, flying time is in millions of hours.
(a) Determine the regression line for predicting fuel consumption on the basis of flying
time.
(b) If there were 24.0 million flying hours during a given year what would be the
prediction for the amount of fuel consumed.
Stock Market Analysis: Time series analysis is commonly used to analyse stock market data,
such as daily closing prices or trading volumes over time. It helps investors identify trends,
patterns, and make informed decisions.
Climate and Weather Patterns: Time series analysis is crucial in studying climate and
weather patterns. Meteorologists analyse historical data to understand long-term trends,
seasonal variations, and predict future weather conditions.
Economic Forecasting: Time series data plays a vital role in economic forecasting.
Economists analyse past data on indicators such as GDP, inflation rates, unemployment rates,
and interest rates to predict future economic trends and make policy recommendations.
Epidemiology and Disease Outbreaks: Time series analysis is employed to study the
spread of diseases, track outbreaks, and monitor the effectiveness of interventions. By
analysing historical data, epidemiologists can make predictions and implement appropriate
measures to control and prevent the spread of diseases.
Energy Demand and Consumption: Time series analysis is used to study energy demand
and consumption patterns. Utilities and energy companies analyse historical data to forecast
future demand, optimize energy production and distribution, and plan for infrastructure
development.
Website Traffic and User Behaviour: Online businesses use time series analysis to
understand website traffic patterns, user behaviour, and engagement metrics over time. This
helps them optimize website performance, marketing campaigns, and user experience.
Sensor Data Analysis: Time series data from sensors is used in various fields like
manufacturing, transportation, and environmental monitoring. It helps detect anomalies,
predict equipment failure, optimize maintenance schedules, and improve operational
efficiency.
Demand Forecasting: Time series analysis is used in demand forecasting for industries like
retail, supply chain management, and inventory planning. By analysing historical sales data,
businesses can predict future demand, optimize inventory levels, and improve operational
efficiency.
Social Media Analytics: Time series analysis is applied to social media data to analyse
trends, sentiment analysis, and predict user behaviour. This helps businesses understand
customer preferences, tailor marketing strategies, and improve brand reputation.
Examples:
• The supermarket's daily revenue recorded over a consecutive four-week timeframe
• The annual unemployment rate observed over the course of the past ten years
• The quarterly quantity of rice production, measured in tons
• The monthly production volume of a soft drink manufacturer, documented for a span of
48 months
• The count of taxi journeys performed per day during a three-week period
• The annual count of houses constructed in the Colombo city between the years 2015 and
2023
• The weekly sales figures of a company
• The yearly number of tourists arriving in Sri Lanka during the last five years
• The firm's annual revenue generated continuously for a period of ten years
The trend component represents the long-term direction of the time series, reflecting its
overall growth or decline. It captures persistent changes that extend beyond short-term
fluctuations.
The seasonal component pertains to systematic and short-term movements in the time series
that recur within specific calendar periods. These patterns can be linked to regular seasonal
factors, such as holidays or weather conditions, which influence the data.
The cyclical component encompasses medium-term fluctuations that occur over a span of
several years, typically recurring in a cyclical pattern. These fluctuations are not tied to
specific calendar periods and may be driven by economic cycles, business cycles, or other
underlying factors affecting the data.
The irregular component accounts for unsystematic and random fluctuations in the time
series that cannot be attributed to any identifiable trend, seasonality, or cyclical pattern.
These irregularities can arise from unpredictable events, measurement errors, or other
sources of noise in the data.
The four components of variation are assumed to combine to produce the variable in one of
two ways: thus, we have two mathematical models of the variable. In the first case there is
the additive model, in which the components are assumed to add together to give the
variable, Y:
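Y = T + S + C + R
In the second case there is the multiplicative model, in which the components are assumed
to multiply together to give the variable, Y:
Y = T × S × C × R
Here T is the trend, S the seasonal component, C the cyclical component and R the random
(irregular) component.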
4.6.2 Trend
This is the general overall movement persisting over a long period of time. There are all sorts
of trends; some series increase slowly and some increase fast, others decrease at varying
rates, and some remain relatively constant for long periods of time. The various types of
trends are divided under two headings.
• Linear trend
• Non-linear trend
a) Linear trend
A linear trend is one in which the variable is basically changing at a steady rate. If the trend
in a time series is linear, it could be estimated using the least squares method.
Illustration 4.8
The following table gives the quarterly sales figures of a small company over the last
three years.
Required:
Calculate the forecast for the next four values of the trend in the series.
2012 Q1 (t = 1) 42
Q2 (t = 2) 41
Q3 (t = 3) 52
Q4 (t = 4) 39
2013 Q1 (t = 5) 45
Q2 (t = 6) 48
Q3 (t = 7) 61
Q4 (t = 8) 46
2014 Q1 (t = 9) 52
Q2 (t = 10) 51
Q3 (t = 11) 68
Q4 (t = 12) 48
The given points are plotted on the graph below. Although the figures fluctuate from
quarter to quarter, two things can immediately be observed. Firstly, there is a general,
though not exceptional, upward trend in sales as a whole. Secondly, there is a visible
pattern in the fluctuations: sales are always highest in the third quarter of the year and
are at or near their lowest in the fourth quarter. This is known as seasonal variation,
which will be dealt with later.
[Figure: quarterly sales plotted against time, Q1 to Q4 for each of the three years; vertical axis from 0 to 80]
Let x and y be the time period and the sales (in Rs. million) respectively. Then the
following totals could be obtained manually or using a scientific calculator as described
in chapter 6.
n = 12, ∑x = 78, ∑y = 593, ∑x² = 650, ∑xy = 4,046
Substituting these totals into the least-squares formulas gives b = 1.339 and a = 40.712.
The trend equation is therefore T = 40.712 + 1.339x, where T is the trend for a given
time period x.
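The trend line and the four trend forecasts asked for in the illustration can be sketched as follows. The code fits the least-squares line to the twelve quarterly figures and evaluates it at t = 13 to 16; variable names are illustrative.

```python
# Quarterly sales for t = 1..12 (2012 Q1 to 2014 Q4), from the illustration
sales = [42, 41, 52, 39, 45, 48, 61, 46, 52, 51, 68, 48]
periods = list(range(1, len(sales) + 1))

n = len(sales)
sum_t, sum_y = sum(periods), sum(sales)
sum_ty = sum(t * y for t, y in zip(periods, sales))
sum_t2 = sum(t * t for t in periods)

b = (n * sum_ty - sum_t * sum_y) / (n * sum_t2 - sum_t ** 2)  # slope
a = sum_y / n - b * sum_t / n                                 # intercept

# Trend forecasts for the next four quarters (t = 13, 14, 15, 16)
forecasts = [a + b * t for t in range(13, 17)]
print(round(a, 3), round(b, 3))
print([round(f, 2) for f in forecasts])
```

These are forecasts of the trend only; a seasonal adjustment would still be needed to forecast the actual sales in each quarter.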
In numerous instances, it is not feasible to rely on a straight-line pattern to gauge the trend.
This is particularly evident when a time series exhibits a slower (or faster) rate of growth in
its early stages and a faster (or slower) rate of growth in more recent times. In such cases, a
straight-line trend is unsuitable. The "method of moving averages" is the most frequently
employed approach for quantifying non-linear trends.
When using the method of moving averages to determine a trend, we gather data for a
specific number of years (quarters, months, weeks, or days). We then calculate the average
of these values, which represents the trend value for the time unit in the middle of the period
covered by the calculation. These calculated averages are referred to as "moving averages"
because each average is determined by shifting from one overlapping set of values to the
next. The number of values in each set is consistent and referred to as the "period" of the
moving average.
If the time series is a, b, c, d, e, f, … and so on, then the 3-season moving averages would be
$$\frac{a+b+c}{3},\ \frac{b+c+d}{3},\ \frac{c+d+e}{3},\ \dots$$
and so on. Similarly, the 5-season moving averages would be
$$\frac{a+b+c+d+e}{5},\ \frac{b+c+d+e+f}{5},\ \frac{c+d+e+f+g}{5},\ \dots$$
and so on.
Illustration 4.9
The power consumption of a factory over a period of one year is shown below. Draw the
time series graph and calculate the trend in power consumption over the period of 12
months using 3-month moving average.
Month                   Jan  Feb  Mar  Apr  May  June  July  Aug  Sept  Oct  Nov  Dec
Power consumed (units)  332  316  327  338  346  288   314   340  336   350  370  363
Solution
[Figure 2: time series graph of monthly power consumption (units), January to December]
If we notice the trend in power consumption, we can see that it increases up to the month of
April and thereafter decreases up to July and then again increases.
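The 3-month moving averages behind this observation can be computed with a short sketch. The function name `moving_average` is illustrative; each average is the trend value for the middle month of its three-month window, so the first value belongs to February and the last to November.

```python
def moving_average(values, period):
    """Simple moving averages over consecutive overlapping windows."""
    return [
        sum(values[i:i + period]) / period
        for i in range(len(values) - period + 1)
    ]

# Monthly power consumption, January to December, from the illustration
power = [332, 316, 327, 338, 346, 288, 314, 340, 336, 350, 370, 363]

trend = moving_average(power, 3)
months = ["Feb", "Mar", "Apr", "May", "June", "July", "Aug", "Sept", "Oct", "Nov"]
for month, value in zip(months, trend):
    print(month, round(value, 1))
```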
The main disadvantage of moving averages method is that no trend values are obtained for
the beginning and end time points of a series. With 3-season moving average, the first and
the last moving totals would be missing. Similarly with 5-season moving average, the first
two and the last two moving totals would be missing, and so on.
Note
When dealing with time series data in a business context, quarterly or monthly data is often
used. In the case of quarterly data, the appropriate moving average is the four-quarter
moving average. However, there is an issue when using an even number of seasons for the
moving average: each average falls midway between two time periods rather than on a
specific one.
To address this problem, a technique called "centring" is used. Centring involves taking the
average of each four-quarter moving total and the subsequent moving total and placing it
between the two moving totals. These averaged values are referred to as "centred moving
totals." Moving averages are then calculated by dividing the centred moving totals by four.
By employing centring, we can overcome the challenge of relating trend figures to specific
quarters and ensure accurate analysis of the time series data.
Illustration 4.10
The table below shows the number of visitors (in hundreds) to a hotel during a period
of three years. Calculate the trend forecast for the four quarters in 2015 based on the
data provided using four-quarter moving averages method.
Year Quarters
Q1 Q2 Q3 Q4
2012 36 18 22 44
2013 40 20 24 46
2014 48 20 26 58
Solution:
Year  Quarter  Visitors  Four-quarter   Centred        Trend
                         moving total   moving total   (moving average)
2012  Q1       36        -              -              -
      Q2       18        -              -              -
                         120
      Q3       22                       122            30.5
                         124
      Q4       44                       125            31.25
                         126
2013  Q1       40                       127            31.75
                         128
      Q2       20                       129            32.25
                         130
      Q3       24                       134            33.5
                         138
      Q4       46                       138            34.5
                         138
2014  Q1       48                       139            34.75
                         140
      Q2       20                       146            36.5
                         152
      Q3       26        -              -              -
      Q4       58        -              -              -
If we notice the growth in trend between 2012 Q3 and 2014 Q2 is 6.00 over a period of
eight quarters and so the average growth would be 0.75 per quarter.
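The centred four-quarter moving averages in the solution can be reproduced with the sketch below. It takes the final quarter of 2014 as 58, the value that the last moving total of 152 in the solution implies; variable names are illustrative.

```python
# Quarterly visitors (hundreds), 2012 Q1 to 2014 Q4.
# Assumption: 2014 Q4 = 58, consistent with the final moving total of 152.
visitors = [36, 18, 22, 44, 40, 20, 24, 46, 48, 20, 26, 58]

# Four-quarter moving totals (each falls between two quarters)
moving_totals = [sum(visitors[i:i + 4]) for i in range(len(visitors) - 3)]

# Centred moving totals: average of each pair of adjacent moving totals
centred_totals = [
    (moving_totals[i] + moving_totals[i + 1]) / 2
    for i in range(len(moving_totals) - 1)
]

# Trend values for 2012 Q3 through 2014 Q2
trend = [total / 4 for total in centred_totals]
print(moving_totals)
print(trend)
```

The first trend value corresponds to the third quarter of the series and the last to the third-from-last quarter, which is why two quarters at each end have no trend figure.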
Seasonal variations refer to the recurrent, short-term cyclic fluctuations in business activity
that transpire consistently on an annual basis. Given their predictable recurrence over a span
of twelve months, these variations can be reasonably anticipated with a fair degree of
accuracy.
Examples:
• During festival seasons, there is typically a surge in garment sales as people tend to make
more purchases for special occasions and celebrations.
• Rainy seasons witness increased sales of raincoats and umbrellas as individuals seek
protection from the inclement weather.
• On rainy days, the demand for ice cream tends to be quite low as people are less inclined
to consume cold treats in such weather conditions.
• Supermarket sales tend to be higher towards the end of the week compared to the
beginning, possibly due to factors such as people stocking up for the weekend or
increased shopping activities before the start of the weekend.
The seasonal variations show on average, by how much a particular season will tend to
increase or decrease the value from the trend.
If a time series (Y) values are given together with the trend (T) values, then the procedure
for calculating the seasonal variations would be as follows:
Solution:
With additive model, the series value would be given as Y = T + C + S + R. If we assume
that there are no cyclical and random variations, then C = 0 and R = 0
Thus, we obtain that Y = T + S. This implies that S = Y - T
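Year  Quarter  Visitors (Y)  Trend (T)  S = Y − T
2012  Q1       36            -          -
      Q2       18            -          -
      Q3       22            30.5       -8.50
      Q4       44            31.25      12.75
2013  Q1       40            31.75      8.25
      Q2       20            32.25      -12.25
      Q3       24            33.5       -9.50
      Q4       46            34.5       11.50
2014  Q1       48            34.75      13.25
      Q2       20            36.5       -16.50
      Q3       26            -          -
      Q4       58            -          -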
Year Quarters
Q1 Q2 Q3 Q4
2012 - - -8.50 12.75
2013 8.25 -12.25 -9.50 11.50
2014 13.25 -16.50 - -
Total adjustments 21.50 -28.75 -18.00 24.25 = -1
Adjustment +0.25 +0.25 +0.25 +0.25 = +1
Adjusted seasonal variation 21.75 -28.50 -17.75 24.50 = 0
Projections for the year- 2015
The seasonal component (S) is given as the arithmetic mean of Y/T for each quarter, where
Y denotes the actual sales and T the trend obtained by using the regression equation. Adjust
your average seasonal variations so that they add to four.
Illustration 4.11
With Multiplicative model, the series (Y) value can be given as the product of the
components of the time series.
Y = T × S × C × R
If we assume that there are no cyclical and random variations, then C = 1 and R = 1, so
Y = T × S
S = Y / T
Year Quarter t T Sales, Y Y/T
2011 1 1 30.87 24.8 0.8034
2 2 33.19 36.3 1.0937
3 3 35.51 38.20 1.0729
4 4 37.84 47.50 1.2553
Year Q1 Q2 Q3 Q4
2011 0.8034 1.0937 1.0729 1.2553
2012 0.7769 0.9885 0.9685 1.1859
2013 0.8087 0.9423 0.9980 1.2243
2014 0.9309 0.9463 0.9510 1.0482
Total 3.3199 3.9708 3.9904 4.7137
2 2 960
2019 1,030.0
3 3 1,010 1,022.5 0.99
1,015.0
4 4 1,250 997.5 1.25
980.0
1 5 840 997.5 0.86
962.5
2 6 820 930.0 0.88
2020 897.5
3 7 940 917.5 1.02
937.5
4 8 990 917.5 1.03
976.3
1 9 1,000 917.5 1.00
976.3
2 10 975 1,051.3 0.93
2021 1,086.3
3 11 1,100 1,051.3 1.00
1,105.0
4 12 1,270 1,119.4 1.13
1,133.8
1 13 1,075 1,153.8 0.93
1,173.8
2 14 1,090 1,183.8 0.92
2022 1,193.8
3 15 1,260 1,203.1 1.05
1,212.5
4 16 1,350 1,228.8 1.10
1,245.0
1 17 1,150 1,253.8 0.92
1,245.0
2 18 1,220 1,275.0 0.96
2023 1,245.0
3 19 1,330
4 20 1,450
Namely:
- prosperity (or boom)
- decline
- depression
- recovery (or improvement)
4.6.9 Random Variations
The evaluation of cyclical and random components is not required for the purpose of
examination. Only a basic understanding of these two variations and examples for them are
expected.
With additive model, seasonally adjusted data would be Y - S whereas with multiplicative
model it would be Y / S
• By analysing time series data, we can gain insights into past trends and patterns, enabling
us to better understand how things have behaved over time.
• Time series analysis aids in making informed decisions for future operations. By studying
historical data, we can identify patterns and make predictions, helping us plan and
prepare for what lies ahead.
• Time series analysis allows us to evaluate current fluctuations and understand the factors
driving them. By examining the data in real-time, we can identify and respond to any
deviations or changes in the patterns.
Introduction to Financial
Mathematics
Learning Outcomes
The concept of time value of money (TVM) holds that a sum of money in hand today is worth
more than the same sum received in the future, because of the returns it could generate in
the meantime. TVM is a fundamental principle in finance, asserting that a specific amount of
money holds greater value if received in the present rather than in the future.
The TVM serves as a valuable tool for making sound financial decisions. Numerous factors
influence the TVM.
P = Rs. 100,000
r = 0.1
n=2
t = 5 years
Solution
$$V = P\left(1 + \frac{r}{n}\right)^{nt}$$
$$V = 100{,}000\left(1 + \frac{0.1}{2}\right)^{2 \times 5}$$
V = Rs. 162,889.46
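The compounding formula can be checked with a short sketch. The function and parameter names are illustrative; the inputs are the values given above (P = Rs. 100,000, r = 0.1, n = 2 compounding periods per year, t = 5 years).

```python
def future_value(principal, annual_rate, periods_per_year, years):
    """V = P * (1 + r/n) ** (n*t), with interest compounded n times a year."""
    periodic_rate = annual_rate / periods_per_year
    total_periods = periods_per_year * years
    return principal * (1 + periodic_rate) ** total_periods

v = future_value(100_000, 0.10, 2, 5)
print(round(v, 2))
```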
5.2 The concept of discounting
The term ‘present value’ simply means the amount of money that must be invested now, for
a duration of n years at an interest rate of r%, in order to accumulate a predetermined future
sum of money at the time it becomes due.
Discounting is the reverse process of compounding. Take a look at the rearrangement of the
formula below.
$$V = P(1 + r)^n$$
$$P = \frac{V}{(1 + r)^n} = V \times \frac{1}{(1 + r)^n}$$
Let’s substitute PV for present value instead of P and FV for future value instead of V. Then
the formula becomes as follows:
$$PV = FV \times \frac{1}{(1 + r)^n}$$
Here, we define the term $\frac{1}{(1 + r)^n}$ as the discounting factor (DCF).
Therefore,
$$PV = FV \times DCF$$
Hence, the present value can be obtained by multiplying the future value with the discounting
factor.
Substitute the given values into the formula and use a calculator to obtain the discounting
factor. The following table illustrates the calculation of the discounting factor for
different values of r (interest rate) and n (number of compounding periods).
Illustration 5.2 – Calculation of discounting factors
r        n                    1 / (1 + r)^n           DCF
10%      0                    1 / (1 + 0.1)^0         1
10%      1                    1 / (1 + 0.1)^1         0.9091
10%      2                    1 / (1 + 0.1)^2         0.8264
10%      5                    1 / (1 + 0.1)^5         0.6209
15%      0                    1 / (1 + 0.15)^0        1
15%      1                    1 / (1 + 0.15)^1        0.8696
15%      2                    1 / (1 + 0.15)^2        0.7561
15%      5                    1 / (1 + 0.15)^5        0.4972
20%      0                    1 / (1 + 0.2)^0         1
20%      1 year 6 months      1 / (1 + 0.2)^1.5       0.7607
20%      5 years 3 months     1 / (1 + 0.2)^5.25      0.3840
12.5%    3                    1 / (1 + 0.125)^3       0.7023
18.75%   8                    1 / (1 + 0.1875)^8      0.2529
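The discounting factors in the table can be reproduced with a short sketch. The function name `discount_factor` is illustrative; part-years are expressed as decimals (1 year 6 months = 1.5, 5 years 3 months = 5.25).

```python
def discount_factor(rate, periods):
    """DCF = 1 / (1 + r) ** n; n may be fractional for part-years."""
    return 1 / (1 + rate) ** periods

# A few (rate, periods) pairs from the table above
cases = [(0.10, 5), (0.15, 2), (0.20, 1.5), (0.20, 5.25), (0.125, 3), (0.1875, 8)]
for rate, periods in cases:
    print(f"r = {rate:.2%}, n = {periods}: DCF = {discount_factor(rate, periods):.4f}")
```

Multiplying a future value by the relevant factor gives its present value, for example `400_000 * discount_factor(0.08, 2)`.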
Solution
Using the present value table:
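(a) Using the table, you can find the present value factor (or discount factor) of 0.857
for n = 2 and r = 8%. This means that the present value of Rs. 1 for n = 2 and r = 8%
is Rs. 0.857; hence the present value (PV) of Rs. 400,000.00 is approximately:
400,000 x 0.857 = 342,800
Applying the formula directly, without rounding the factor, gives the more precise figure:
$$PV = 400{,}000 \times \frac{1}{(1 + 0.08)^2} = \text{Rs. } 342{,}935.53$$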
(b) Similarly, the PV factor (discount factor) for n=3 and r=15% is 0.658. Hence the PV
of 750,000 is:
750,000 x 0.658 = 493,500
Note: It is observed that utilising tables may result in a loss of some accuracy due to the
rounding errors of the discounting factor. Nevertheless, when numerous calculations
are involved, their usage proves to be significantly faster, making tables a preferred
choice. However, it is important to note that tables may not always be applicable due to
certain “gaps” present in them. For instance, combinations such as n = 2.5 years and r =
4.5% may not appear in the tables, requiring the application of first principles
(calculator method) in examples involving such values.
Calculate the present values of the following amounts, using PV tables whenever
possible, and resorting to first principles otherwise.
(a) Rs. 400,000 payable in 8 years’ time at a rate of 12%
(b) Rs. 125,000 payable in 3 years’ time at a rate of 7%
(c) Rs. 250,000 payable in 5 years and 9 months’ time at a rate of 10%
(d) Rs. 500,000 payable in 8 years’ time at a rate of 7.35%
Net present value (NPV) is a financial metric used to determine the profitability of an
investment project. It represents the difference between the present value of cash inflows
and the present value of cash outflows over a given time period. NPV takes into account the
time value of money by discounting future cash flows back to their present value.
A positive NPV indicates that the investment is expected to generate more value than the
initial cost, making it potentially worthwhile. Conversely, a negative NPV suggests that the
investment may result in a loss or not meet the required return.
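The NPV calculation can be sketched in code. All figures in this example are hypothetical and chosen only for illustration (an initial outlay of Rs. 100,000 followed by three year-end inflows, discounted at 10%); they are not the data of any illustration in this text. The function name `npv` is also illustrative.

```python
def npv(rate, initial_outlay, inflows):
    """NPV = present value of year-end inflows minus the outlay made now."""
    present_value = sum(
        cash / (1 + rate) ** year
        for year, cash in enumerate(inflows, start=1)
    )
    return present_value - initial_outlay

# Hypothetical project: Rs. 100,000 now; inflows at the end of years 1 to 3
result = npv(0.10, 100_000, [40_000, 50_000, 30_000])
print(round(result, 2))
```

A positive result means the discounted inflows exceed the outlay at the chosen discount rate; a scrap value, where there is one, is simply added to the inflow of the final year.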
Additionally, at the end of five-year period, the machine can be scrapped for Rs. 55,000.
Assuming an interest rate of 10% per annum, find the NPV of the machine.
(Assume all the above cash inflows occur at the end of the year.)
Solution
A simple tabular form can be used.
Year Rs.
1 60,000
2 80,000
3 50,000
4 10,000
5.4 Annuities
An annuity is a financial product that involves a series of regular payments or receipts over
a specified number of periods. In the context of investment, an annuity represents a series of
equal payments made at equal intervals. Examples of annuities include regular deposits into
a savings account, monthly home mortgage payments, monthly insurance payments,
retirement planning and pension payments. Annuities can be categorised based on the
frequency of payment dates.
There are mainly two types of annuities according to when the payments occur. When each
payment is made at the end of the period, it is called an ordinary annuity. When each
payment is made at the beginning of the period, it is said to be an annuity due.
When comparing two or more annuities, it is important to consider that they may cover
different time periods and their net present values (NPV) become significant in making the
ultimate decision.
The present value of an annuity of Rs. 1 paid for n years at the rate of r% is given by
$$\frac{1}{r}\left[1 - \frac{1}{(1 + r)^n}\right]$$
This is called the annuity factor (or discount factor).
This annuity factor represents the present value of an annuity of Rs. 1 per period for the
specified number of periods and interest rate.
The present value of an annuity of Rs. C paid for n years at the rate of r% is given by
$$C \times \frac{1}{r}\left[1 - \frac{1}{(1 + r)^n}\right] = C \times \text{Annuity factor}$$
Note: This interest rate is called as discount rate since it is used to discount a stream of
future cash flows to their present value.
An annuity table is a financial tool used to calculate the present value of equal cash flows over
a specific period of time. An annuity table provides the annuity factor based on different
interest rates (discount rates) and time periods. To calculate the present value of an annuity,
you multiply the annual cash flows by relevant annuity factor obtained from the table. A
portion of the table is provided below for your convenience. For example, find the present
value of an annuity that provides a payment of Rs. 10,000 per year for a duration of 5 years,
assuming an expected interest rate of 10%. First using the annuity table, locate the
corresponding annuity factor, which is 3.791. Next multiply this by the given cash flow per
year (Rs. 10,000) to calculate the present value. Hence the answer is Rs. 10,000 × 3.791 = Rs.
37,910.
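Where an annuity table is not to hand, the factor can be calculated directly from the formula. The sketch below assumes payments at the end of each period, and rounds the factor to three decimal places to mirror the table; the function name `annuity_factor` is illustrative.

```python
def annuity_factor(rate, periods):
    """Annuity factor = (1/r) * (1 - 1/(1 + r)**n), payments at period end."""
    return (1 - (1 + rate) ** -periods) / rate

# Rs. 10,000 per year for 5 years at 10%, as in the example above
factor = round(annuity_factor(0.10, 5), 3)   # table value, 3 decimal places
present_value = 10_000 * factor
print(factor, round(present_value, 2))
```

The same function also works in reverse: dividing a present value by the factor gives the equal annual payment that repays it.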
Answer
Method 1: Using the annuity formula:
We know the relationship between the present value (PV) of an annuity, annuity (C) and the
annuity factor. Using this relationship, we can easily form an equation to find the annuity for
a given present value.
$$\text{Annuity} = \frac{PV}{\text{Annuity factor}}$$
Answer
The logic of the question can be depicted as follows.
Year Year Year Year Year …………………………………….. Year
01 02 03 04 05 12
Rs. 125,000
Since the company pays out the loan money now, the present value (PV) of the loan is Rs.
125,000. The annual repayment on the loan can be thought of as an annuity. We can
therefore use the annuity formula for the calculation of annuity.
The annuity factor is found by looking in the annuity table (the cumulative present
value table) under n = 12 and r = 12%. The corresponding factor is 6.194.
Therefore, annuity = 125,000 / 6.194 = Rs. 20,180.82
The borrower should pay Rs. 20,180.82 each year for 12 years to repay the loan.
5.5 Perpetuity
A perpetuity is an annuity in which the periodic payments begin on a fixed date and
continue indefinitely. It is sometimes referred to as a perpetual annuity. Fixed coupon
payments on permanently invested (irredeemable) sums of money are prime examples of
perpetuities.
Perpetuity is similar to an annuity, but with the key distinction: the payments continue
indefinitely, without a predetermined end date. This makes perpetuities particularly
attractive to individuals or organisations seeking to ensure ongoing payments to their
descendants or support a long-term cause.
Since this is a type of annuity, the same formula used above to compute the present value of
annuity of an amount of Rs. C paid for n years at the rate of r% can be used.
The formula for calculating the present value of perpetuity is:
$$\lim_{n \to \infty} C \times \frac{1}{r}\left[1 - \frac{1}{(1 + r)^n}\right] = \frac{C}{r}$$
Where C is the cash flow received each period and r is the interest rate.
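The limit can be seen numerically: as n grows, the present value of an ordinary annuity approaches C / r. The figures below (C = Rs. 10,000 and r = 10%) are hypothetical and chosen only for illustration, as are the function names.

```python
def annuity_pv(cash_flow, rate, periods):
    """Present value of an annuity paid at the end of each period."""
    return cash_flow * (1 - (1 + rate) ** -periods) / rate

def perpetuity_pv(cash_flow, rate):
    """Present value of a perpetuity: C / r."""
    return cash_flow / rate

# Hypothetical: Rs. 10,000 per year at 10%
for years in (10, 50, 100, 500):
    print(years, round(annuity_pv(10_000, 0.10, years), 2))
print("perpetuity", round(perpetuity_pv(10_000, 0.10), 2))
```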
5.6 Amortisation
Amortisation is commonly associated with long-term loans such as mortgages, car loans, or
personal loans, where borrowers make regular payments over an extended period until the
loan is fully repaid.
Amortisation schedules are used by lenders, such as financial institutions, to outline the
repayment schedule of a loan based on a specific maturity date. These schedules are
commonly used for amortising loans, which involve a consistent payment amount throughout
the loan term, but with varying proportions of interest and principal in each payment. A
typical example of an amortising loan is a traditional mortgage.
Step 02: Prepare a table with six columns showing years, loan at beginning, interest, total
to be paid, period payment and period-end balance.
Step 03: Complete the table with the values until the balance amount becomes zero (or
approximately zero).
A loan of Rs. 700,000 was obtained by Amal for 5 years at an interest rate of 8% per
annum. The loan is to be settled in 5 equal annual instalments.
Solution
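a) PV = Annuity (C) × Annuity factor
\[ \text{Annuity} = \frac{PV}{\text{Annuity factor}} \]
\[ \text{Annuity amount} = \frac{700{,}000}{3.9927} = \text{Rs. } 175{,}319.52 \]
Here 3.9927 is the annuity factor for n = 5 and r = 8%, shown rounded to four decimal
places; the instalment of Rs. 175,319.52 is obtained with the unrounded factor.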
b) Amortisation schedule
Year   Loan at the      Interest @ 8%    Total to be     Annual         Balance at the
       beginning (A)    B = A × 0.08     paid (A + B)    payment (C)    end (A + B − C)
01     700,000          56,000           756,000         175,319.52     580,680.48
02     580,680.48       46,454.44        627,134.92      175,319.52     451,815.40
03     451,815.40       36,145.23        487,960.63      175,319.52     312,641.11
04     312,641.11       25,011.29        337,652.40      175,319.52     162,332.88
05     162,332.88       12,986.63        175,319.51      175,319.52     0
The Rs. 0.01 difference between the total to be paid and the payment in Year 05 is due to
rounding, so the closing balance is taken as zero.
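The schedule can be reproduced with a short Python sketch. It assumes the instalment is computed from the annuity formula (rather than read from a table) and that every amount is rounded to two decimal places each year; the function name and the row layout are illustrative.

```python
def amortisation_schedule(principal, rate, years):
    """Return the equal annual payment and one row per year.

    Each row holds: (year, opening balance A, interest B, total due A + B,
    payment C, closing balance A + B - C), all rounded to two decimals.
    """
    factor = (1 - (1 + rate) ** -years) / rate   # annuity factor
    payment = round(principal / factor, 2)       # equal annual instalment

    rows = []
    balance = principal
    for year in range(1, years + 1):
        interest = round(balance * rate, 2)
        total_due = round(balance + interest, 2)
        closing = round(total_due - payment, 2)
        rows.append((year, balance, interest, total_due, payment, closing))
        balance = closing                        # carried to the next year
    return payment, rows


payment, rows = amortisation_schedule(700_000, 0.08, 5)
for year, opening, interest, total_due, paid, closing in rows:
    print(f"{year:02d} {opening:>12,.2f} {interest:>10,.2f} "
          f"{total_due:>12,.2f} {paid:>12,.2f} {closing:>12,.2f}")
```

Because of rounding, the final closing balance comes out within one cent of zero rather than exactly zero, which matches the "approximately zero" condition in Step 03.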
The nominal interest rate is the stated or quoted rate of interest on a financial instrument
without considering the effect of compounding.
The effective interest rate is the interest rate that takes into account the effect of
compounding, providing a more accurate measure of the cost of borrowing or the return on
investment.
The effective interest rate exceeds the nominal interest rate whenever interest is
compounded more than once a year; with annual compounding the two are equal.
To find the effective annual rate of interest, we need to consider the impact of two 6%
increases on an initial value of Rs. 1.
The value at the end of 1 year = 1 × 1.06 × 1.06 = Rs. 1.1236
If we deduct the initial Rs. 1 from the final value, we get Rs. 0.1236. This represents the
interest earned on Rs. 1 over one year.
Expressed as a percentage, this is 12.36% per annum, which is higher than the stated
nominal rate of 12%.
Therefore, the nominal rate is 12% per annum, while the effective rate of interest is
12.36% per annum.
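The same reasoning generalises to any number of compounding periods per year. The sketch below assumes the nominal 12% is compounded twice a year, as the two 6% increases suggest; the function name `effective_annual_rate` is illustrative.

```python
def effective_annual_rate(nominal_rate, periods_per_year):
    """Effective annual rate: (1 + nominal / m)^m - 1."""
    periodic_rate = nominal_rate / periods_per_year
    return (1 + periodic_rate) ** periods_per_year - 1


# Nominal 12% per annum, compounded semi-annually (6% per half-year).
ear = effective_annual_rate(0.12, 2)
print(f"Effective annual rate: {ear:.2%}")

# With annual compounding the effective and nominal rates coincide.
print(f"Annual compounding:    {effective_annual_rate(0.12, 1):.2%}")
```

Increasing the number of compounding periods per year raises the effective rate further above the nominal rate.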
Inflation is a sustained increase in the general price level of goods and services in an economy
over a period of time. When the price level rises, each unit of currency buys fewer goods and
services. Consequently, inflation reflects a reduction in the purchasing power per unit of
money – a loss of real value in the medium of exchange and unit of account within the
economy. A chief measure of price inflation is the inflation rate, the annualised percentage
change in the general price index. The opposite of inflation is deflation.
Institute of Certified Management Accountants of Sri Lanka
29/24, Visakha Private Road, Colombo 04, Sri Lanka
Tel : +94 (0)11 2506391, 2507087, 4641701-3 , Fax : Ext 118
www.cma-srilanka.org
secretariat@cma-srilanka.org
ISBN 978-955-0926-43-5