Contemporary Math (Statistics - Docx Semi

This module discusses measures of dispersion and location for data sets. It defines key terms like standard deviation, variance, range, average deviation, quartiles, percentiles, and others. Standard deviation measures how widely values are dispersed from the average and is a measure of volatility. The module provides examples of calculating measures of dispersion like range, average deviation, and standard deviation for both individual data sets and grouped data sets. It discusses the characteristics, uses, advantages and disadvantages of different measures of dispersion.
NAGA COLLEGE FOUNDATION, INC.

M.T. Villanueva Avenue, Naga City


College of Business and Accountancy

Contemporary Math
including (Statistics)
Module: Semi – Final

Name, Course, Block, and Schedule

Prepared by:
Alejandro P. Alaurin II

Measures of Dispersion and location

OVERVIEW: An important characteristic of a data set is how it is distributed, that is, how far each
element lies from some measure of central tendency (average). There are several ways to measure the
variability of data. The most common and most important is the standard deviation, which gives a
typical distance of the elements from the mean, but several others are also important and are discussed here.
Standard deviation is a statistical term that provides a good indication of volatility. It measures how
widely values are dispersed from the average. Dispersion is the difference between the actual value and
the average value.
This chapter covers the different measures of dispersion (range, average deviation, variance, and
standard deviation), the coefficient of variation, measures of location (quartiles, deciles, and percentiles),
and related topics such as the midhinge, interquartile range, and quartile deviation. The later part of the
chapter deals with kurtosis, skewness, determining outliers, plotting a boxplot, and the effects of changing
units on measures of dispersion and location.

LEARNING OUTCOMES:
After completing this chapter, the students will be able to:
- Compute the different measures of dispersion for both grouped and ungrouped data.
- Discuss the uses, characteristics, advantages, and disadvantages of measures of dispersion.
- Solve and interpret quartiles, deciles, and percentiles.
- Compute and interpret the coefficient of variation, the coefficient of skewness, and kurtosis.
- Differentiate the coefficient of skewness and kurtosis.

DISCUSSION:

Range
Probably the simplest way to measure dispersion is the range. The range is the difference between
the highest value and the lowest value in the data set.
The range has two advantages: it is easy to compute and easy to understand. On the other hand, it
also has two disadvantages: it can be distorted by a single extreme value (or outlier), and only two
values are used in the calculation.
Example: The daily rates of a sample of eight employees at GMS Inc. are P550, P420, P560, P500, P700,
P670, P860, and P480. Find the range.
Solution:
Step 1: Determine the highest value and lowest value in the data set.
Highest Value (HV) = P860    Lowest Value (LV) = P420
Step 2: Solve for the range.
Range = Highest Value (HV) – Lowest Value (LV) = P860 – P420 = P440
The range of the daily rates is P440.
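
The two steps above can be sketched in Python. This is only an illustration using the GMS Inc. figures from the example; the function name `value_range` is a hypothetical helper, not part of the module.

```python
# Daily rates (in pesos) of the eight GMS Inc. employees from the example.
daily_rates = [550, 420, 560, 500, 700, 670, 860, 480]

def value_range(values):
    """Return the highest value minus the lowest value."""
    return max(values) - min(values)

print(value_range(daily_rates))  # 440
```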

Average Deviation
In statistics, the absolute deviation of an element of a data set is the absolute difference between
that element and a given point. Typically the point from which the deviation is measured is a measure of
central tendency, most often the mean. The average deviation is the mean of these absolute deviations. It
is a summary statistic of statistical dispersion or variability, and it is also called the mean absolute
deviation.

$$AD = \frac{\sum |x - \mu|}{N} \quad \text{or} \quad AD = \frac{\sum |x - m|}{n}$$

Where AD = average deviation
x = the value of any particular observation or measurement
μ = population mean
m = sample mean
n = sample size
N = population size

Example: The daily rates of a sample of eight employees at GMS Inc. are 550, 420, 560, 500, 700, 670,
860, and 480. Find the average deviation.

Solution:

Step 1: Compute the mean of the data set.

$$m = \frac{\sum x}{n} = \frac{550+420+560+500+700+670+860+480}{8} = \frac{4{,}740}{8} = 592.50$$

Step 2: Subtract the mean from each value in the data set.

X       X – m      Found by
550     -42.5      550 – 592.50 = -42.50
420     -172.5     420 – 592.50 = -172.50
560     -32.5      560 – 592.50 = -32.50
500     -92.5      500 – 592.50 = -92.50
700     107.5      700 – 592.50 = 107.50
670     77.5       670 – 592.50 = 77.50
860     267.5      860 – 592.50 = 267.50
480     -112.5     480 – 592.50 = -112.50
∑x = 4,740    ∑(x – m) = 0

Notice that the sum of x – m is always equal to zero.

Step 3: Get the absolute values of x – m, then get their sum.

X       X – m      |x – m|
550     -42.5      42.5
420     -172.5     172.5
560     -32.5      32.5
500     -92.5      92.5
700     107.5      107.5
670     77.5       77.5
860     267.5      267.5
480     -112.5     112.5
∑x = 4,740    ∑(x – m) = 0    ∑|x – m| = 905

Step 4: Solve for the average deviation.

$$AD = \frac{\sum |x - m|}{n} = \frac{905}{8} = 113.125 \approx 113.13$$

Hence, the average deviation of the data is 113.13.
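
As a rough check of the four steps, the same computation can be written as a short Python sketch. The helper name `average_deviation` is an assumption made for illustration; it applies the sample form of the formula (dividing by n).

```python
def average_deviation(values):
    """Mean of the absolute deviations from the mean."""
    n = len(values)
    mean = sum(values) / n                       # Step 1: mean
    abs_devs = [abs(x - mean) for x in values]   # Steps 2-3: |x - m|
    return sum(abs_devs) / n                     # Step 4: divide by n

# Daily rates of the eight GMS Inc. employees from the example.
daily_rates = [550, 420, 560, 500, 700, 670, 860, 480]
print(average_deviation(daily_rates))  # 113.125
```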

B. Average deviation for Grouped data

$$AD = \frac{\sum f\,|x - \mu|}{N} \quad \text{or} \quad AD = \frac{\sum f\,|x - m|}{n}$$

Where AD = average deviation
f = frequency
x = the value of any particular observation or measurement (the class midpoint for grouped data)
μ = population mean
m = sample mean
n = sample size
N = population size

Example: The data below show the frequency distribution of the electric bills of typical households
in Batangas City for the month of January 2009. Find the average deviation.

Amount of electric bill    700-849    850-999    1000-1149    1150-1299    1300-1449
Number of families         2          9          15           9            5

Solution:
Step 1: Compute the mean of the frequency distribution.

Class limits     f     x         x is found by                  fx           fx is found by (f times x)
700 – 849        2     774.5     (700 + 849) / 2 = 774.5        1,549.00     2 x 774.5
850 – 999        9     924.5     (850 + 999) / 2 = 924.5        8,320.50     9 x 924.5
1000 – 1149      15    1074.5    (1000 + 1149) / 2 = 1074.5     16,117.50    15 x 1074.5
1150 – 1299      9     1224.5    (1150 + 1299) / 2 = 1224.5     11,020.50    9 x 1224.5
1300 – 1449      5     1374.5    (1300 + 1449) / 2 = 1374.5     6,872.50     5 x 1374.5
Total            40                                             ∑fx = 43,880 (sum of all fx)

$$m = \frac{\sum fx}{n} = \frac{43{,}880}{40} = 1{,}097$$

Step 2: Subtract the mean from each class midpoint.

Class limits     f     x         x – m      x – m is found by
700 – 849        2     774.5     -322.5     774.5 – 1097 = -322.5
850 – 999        9     924.5     -172.5     924.5 – 1097 = -172.5
1000 – 1149      15    1074.5    -22.5      1074.5 – 1097 = -22.5
1150 – 1299      9     1224.5    127.5      1224.5 – 1097 = 127.5
1300 – 1449      5     1374.5    277.5      1374.5 – 1097 = 277.5
Total            40

Step 3: Get the absolute value of x – m.

Class limits     f     x         x – m      |x – m|
700 – 849        2     774.5     -322.5     322.5
850 – 999        9     924.5     -172.5     172.5
1000 – 1149      15    1074.5    -22.5      22.5
1150 – 1299      9     1224.5    127.5      127.5
1300 – 1449      5     1374.5    277.5      277.5
Total            40

Step 4: Obtain the product of |x – m| and f, and then add.

Class limits     f     |x – m|    f|x – m|     f|x – m| is found by
700 – 849        2     322.5      645.0        2 x 322.5
850 – 999        9     172.5      1,552.5      9 x 172.5
1000 – 1149      15    22.5       337.5        15 x 22.5
1150 – 1299      9     127.5      1,147.5      9 x 127.5
1300 – 1449      5     277.5      1,387.5      5 x 277.5
Total            40               ∑f|x – m| = 5,070

Step 5: Solve for the average deviation.

$$AD = \frac{\sum f\,|x - m|}{n} = \frac{5{,}070}{40} = 126.75$$

Hence, the average deviation of the data set is 126.75.
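
The grouped procedure can be sketched in Python as follows. The function name is hypothetical, and the last class is entered as 1300 to 1449, which is an assumption: those are the limits implied by the worked midpoint of 1374.5 and by the common class width of 150.

```python
def grouped_average_deviation(class_limits, frequencies):
    """Average deviation of a frequency distribution, using class midpoints."""
    midpoints = [(low + high) / 2 for low, high in class_limits]
    n = sum(frequencies)
    mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n
    total = sum(f * abs(x - mean) for f, x in zip(frequencies, midpoints))
    return mean, total / n

# Electric bill classes and number of families from the example.
classes = [(700, 849), (850, 999), (1000, 1149), (1150, 1299), (1300, 1449)]
families = [2, 9, 15, 9, 5]

mean, ad = grouped_average_deviation(classes, families)
print(mean)  # 1097.0
print(ad)    # 126.75
```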
Variance and standard deviation
One of the most widely used measures of dispersion is the standard deviation. The more spread apart the
data, the higher the deviation. Standard deviation is calculated as the square root of the variance. In finance,
standard deviation is applied to the annual rate of return of an investment to measure the investment's
volatility. Standard deviation is also known as historical volatility and is used by investors as a gauge for
the amount of expected volatility.
Variance is a measure of the dispersion of a set of data points around their mean value. It is the
mathematical expectation of the squared deviations from the mean. Volatility is a measure of risk, so this
statistic can help determine the risk an investor might take on when purchasing a specific security.
This section also includes a discussion of the range rule of thumb and the computation of the variance
and standard deviation for both grouped and ungrouped data.

Range rule of Thumb


The range can be used to approximate the standard deviation; this is called the range rule of thumb. A
rough estimate of the standard deviation is

$$s \approx \frac{\text{range}}{4}$$

Note: The range rule of thumb is only an approximation and should be used only when the distribution of
data values is unimodal and roughly symmetric.
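
As a small illustration, the rule can be applied to the GMS Inc. daily rates used elsewhere in this module. The helper name is hypothetical, and the result is only a rough estimate, not a substitute for the computed standard deviation.

```python
def range_rule_estimate(values):
    """Rough estimate of the standard deviation: range divided by 4."""
    return (max(values) - min(values)) / 4

daily_rates = [550, 420, 560, 500, 700, 670, 860, 480]
estimate = range_rule_estimate(daily_rates)
print(estimate)  # 110.0
```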
Sample Variance and sample Standard Deviation for Ungrouped data
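
$$s^2 = \frac{\sum (x-m)^2}{n-1} \qquad \text{(Formula 4-4)}$$

$$s = \sqrt{\frac{\sum (x-m)^2}{n-1}} \qquad \text{(Formula 4-5)}$$

$$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n-1} \qquad \text{(Formula 4-6)}$$

$$s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n-1}} \qquad \text{(Formula 4-7)}$$

Where s² = sample variance
s = sample standard deviation
x = the value of any particular observation or measurement
∑x = sum of all x values
∑x² = sum of the squares of all x values
m = sample mean
n = sample size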
Example 1: The daily rates of a sample of eight employees at GMS Inc. are 550, 420, 560, 500, 700, 670,
860, and 480. Find the variance and standard deviation.
Solution:
Step 1: Compute the mean of the data set.

$$m = \frac{\sum x}{n} = \frac{550+420+560+500+700+670+860+480}{8} = 592.50$$

Step 2: Subtract the mean from each value in the data set.

X       X – m
550     550 – 592.50 = -42.5
420     420 – 592.50 = -172.5
560     560 – 592.50 = -32.5
500     500 – 592.50 = -92.5
700     700 – 592.50 = 107.5
670     670 – 592.50 = 77.5
860     860 – 592.50 = 267.5
480     480 – 592.50 = -112.5
∑x = 4,740    ∑(x – m) = 0

Step 3: Square each x – m, then get the sum.

X       X – m      (x – m)²
550     -42.5      (-42.5)² = 1,806.25
420     -172.5     (-172.5)² = 29,756.25
560     -32.5      (-32.5)² = 1,056.25
500     -92.5      (-92.5)² = 8,556.25
700     107.5      (107.5)² = 11,556.25
670     77.5       (77.5)² = 6,006.25
860     267.5      (267.5)² = 71,556.25
480     -112.5     (-112.5)² = 12,656.25
∑x = 4,740    ∑(x – m) = 0    ∑(x – m)² = 142,950

Step 4: Solve for the variance and the standard deviation using Formula 4-4 and Formula 4-5. We can also
obtain the standard deviation by simply extracting the square root of the variance.

$$s^2 = \frac{\sum (x-m)^2}{n-1} = \frac{142{,}950}{8-1} = 20{,}421.43 \qquad \text{(Formula 4-4, sample variance)}$$

$$s = \sqrt{\frac{\sum (x-m)^2}{n-1}} = \sqrt{\frac{142{,}950}{8-1}} = 142.90 \qquad \text{(Formula 4-5, sample standard deviation)}$$
Alternative solution: An alternative solution can be done using Formula 4-6 and Formula 4-7.
Step 1: Get the sum of the data set.

∑x = 550 + 420 + 560 + 500 + 700 + 670 + 860 + 480 = 4,740

Step 2: Square all the values in the data set and get the sum.

X       x²
550     550² = 302,500
420     420² = 176,400
560     560² = 313,600
500     500² = 250,000
700     700² = 490,000
670     670² = 448,900
860     860² = 739,600
480     480² = 230,400
∑x = 4,740    ∑x² = 2,951,400

Step 3: Apply Formula 4-6 and Formula 4-7 to obtain the values of the variance and standard deviation.

$$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n-1} = \frac{2{,}951{,}400 - \frac{(4{,}740)^2}{8}}{8-1} = \frac{2{,}951{,}400 - 2{,}808{,}450}{7} = 20{,}421.43 \qquad \text{(variance)}$$

$$s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n-1}} = \sqrt{\frac{2{,}951{,}400 - 2{,}808{,}450}{7}} = \sqrt{20{,}421.43} = 142.90 \qquad \text{(standard deviation)}$$
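
Both routes to the sample variance (the deviation form of Formula 4-4 and the sum-of-squares form of Formula 4-6) can be compared with a short Python sketch. The function names are hypothetical helpers chosen for this illustration.

```python
import math

def sample_variance(values):
    """Formula 4-4: sum of squared deviations divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def sample_variance_shortcut(values):
    """Formula 4-6: uses the sum of x and the sum of x squared."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)

daily_rates = [550, 420, 560, 500, 700, 670, 860, 480]
variance = sample_variance(daily_rates)
std_dev = math.sqrt(variance)            # Formula 4-5
print(round(variance, 2))                # 20421.43
print(round(std_dev, 2))                 # 142.9
```

The two functions should agree up to floating-point rounding, which mirrors the point of the alternative solution.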
Sample Variance and Sample Standard deviation for grouped data

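
$$s^2 = \frac{\sum f(x-m)^2}{n-1} \qquad \text{(Formula 4-8)}$$

$$s = \sqrt{\frac{\sum f(x-m)^2}{n-1}} \qquad \text{(Formula 4-9)}$$

$$s^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n-1} \qquad \text{(Formula 4-10)}$$

$$s = \sqrt{\frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n-1}} \qquad \text{(Formula 4-11)}$$

Where s² = sample variance
s = sample standard deviation
x = the value of any particular observation or measurement (the class midpoint)
∑fx = sum of all the products of f and x
∑fx² = sum of all the products of f and the square of x
m = sample mean
f = frequency
n = sample size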

Example 2: Determine the variance and standard deviation of the frequency distribution of the ages of 50
people taking travel tours.

Class Limits    Frequency
18-26           3
27-35           5
36-44           9
45-53           14
54-62           11
63-71           6
72-80           2

Solution:
Step 1: Determine the midpoint of each class.

Class Limits    f     Midpoint (X)
18-26           3     (18 + 26) ÷ 2 = 22
27-35           5     (27 + 35) ÷ 2 = 31
36-44           9     (36 + 44) ÷ 2 = 40
45-53           14    (45 + 53) ÷ 2 = 49
54-62           11    (54 + 62) ÷ 2 = 58
63-71           6     (63 + 71) ÷ 2 = 67
72-80           2     (72 + 80) ÷ 2 = 76

Step 2: Determine the product fX.

Class Limits    f     X     fX
18-26           3     22    3 x 22 = 66
27-35           5     31    5 x 31 = 155
36-44           9     40    9 x 40 = 360
45-53           14    49    14 x 49 = 686
54-62           11    58    11 x 58 = 638
63-71           6     67    6 x 67 = 402
72-80           2     76    2 x 76 = 152

Step 3: Determine the sum of fX.

n = ∑f = 50    ∑fX = 66 + 155 + 360 + 686 + 638 + 402 + 152 = 2,459

Step 4: Apply the formula for the mean.

$$m = \frac{\sum fx}{n} = \frac{2{,}459}{50} = 49.18$$

Step 5: Subtract the mean from each class midpoint.

Class Limits    f     X     fX     x – m
18-26           3     22    66     22 – 49.18 = -27.18
27-35           5     31    155    31 – 49.18 = -18.18
36-44           9     40    360    40 – 49.18 = -9.18
45-53           14    49    686    49 – 49.18 = -0.18
54-62           11    58    638    58 – 49.18 = 8.82
63-71           6     67    402    67 – 49.18 = 17.82
72-80           2     76    152    76 – 49.18 = 26.82

Step 6: Square each x – m.

Class Limits    f     X     x – m      (x – m)²
18-26           3     22    -27.18     (-27.18)² = 738.7524
27-35           5     31    -18.18     (-18.18)² = 330.5124
36-44           9     40    -9.18      (-9.18)² = 84.2724
45-53           14    49    -0.18      (-0.18)² = 0.0324
54-62           11    58    8.82       (8.82)² = 77.7924
63-71           6     67    17.82      (17.82)² = 317.5524
72-80           2     76    26.82      (26.82)² = 719.3124

Step 7: Get the product of f and (x – m)², then obtain the sum.

Class Limits    f     (x – m)²     f(x – m)²
18-26           3     738.7524     3 x 738.7524 = 2,216.2572
27-35           5     330.5124     5 x 330.5124 = 1,652.5620
36-44           9     84.2724      9 x 84.2724 = 758.4516
45-53           14    0.0324       14 x 0.0324 = 0.4536
54-62           11    77.7924      11 x 77.7924 = 855.7164
63-71           6     317.5524     6 x 317.5524 = 1,905.3144
72-80           2     719.3124     2 x 719.3124 = 1,438.6248
Total           n = 50             ∑f(x – m)² = 8,827.3800

Step 8: Apply Formula 4-8 and Formula 4-9 to obtain the variance and standard
deviation for grouped data.

$$s^2 = \frac{\sum f(x-m)^2}{n-1} = \frac{8{,}827.3800}{50-1} = 180.15 \qquad \text{(Formula 4-8)}$$

$$s = \sqrt{\frac{\sum f(x-m)^2}{n-1}} = \sqrt{\frac{8{,}827.3800}{50-1}} = \sqrt{180.15} = 13.42 \qquad \text{(Formula 4-9)}$$
Alternative solution: An alternative solution can be obtained by applying Formula 4-10 and Formula 4-11.
Step 1: Determine the midpoint of each class.

Class Limits    f     Midpoint (X)
18-26           3     (18 + 26) ÷ 2 = 22
27-35           5     (27 + 35) ÷ 2 = 31
36-44           9     (36 + 44) ÷ 2 = 40
45-53           14    (45 + 53) ÷ 2 = 49
54-62           11    (54 + 62) ÷ 2 = 58
63-71           6     (63 + 71) ÷ 2 = 67
72-80           2     (72 + 80) ÷ 2 = 76

Step 2: Multiply each class frequency (f) by the corresponding midpoint (X) to obtain the product fX.

Class Limits    f     X     fX
18-26           3     22    3 x 22 = 66
27-35           5     31    5 x 31 = 155
36-44           9     40    9 x 40 = 360
45-53           14    49    14 x 49 = 686
54-62           11    58    11 x 58 = 638
63-71           6     67    6 x 67 = 402
72-80           2     76    2 x 76 = 152

Step 3: Get the sum of the products fX.

∑fX = 66 + 155 + 360 + 686 + 638 + 402 + 152 = 2,459

Step 4: Multiply fX by X to obtain the product fx², then get the sum.

Class Limits    f     X     fX     fx²
18-26           3     22    66     66 x 22 = 1,452
27-35           5     31    155    155 x 31 = 4,805
36-44           9     40    360    360 x 40 = 14,400
45-53           14    49    686    686 x 49 = 33,614
54-62           11    58    638    638 x 58 = 37,004
63-71           6     67    402    402 x 67 = 26,934
72-80           2     76    152    152 x 76 = 11,552
Total           n = 50      ∑fx = 2,459    ∑fx² = 129,761

Step 5: Apply Formula 4-10 and Formula 4-11 to obtain the variance and standard deviation for
grouped data.

$$s^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n-1} = \frac{129{,}761 - \frac{(2{,}459)^2}{50}}{50-1} = \frac{129{,}761 - 120{,}933.62}{49} = 180.15 \qquad \text{(Formula 4-10)}$$

$$s = \sqrt{\frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n-1}} = \sqrt{\frac{129{,}761 - 120{,}933.62}{49}} = \sqrt{180.15} = 13.42 \qquad \text{(Formula 4-11)}$$

Notice that we obtained the same result for the variance (180.15) and the standard
deviation (13.42).
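
The grouped computation can be sketched in Python using the shortcut form of Formula 4-10. The function name is a hypothetical helper, and the midpoints and frequencies are those of the travel-tour example.

```python
import math

def grouped_sample_variance(midpoints, frequencies):
    """Formula 4-10: variance of grouped data from sums of fx and fx^2."""
    n = sum(frequencies)
    sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    sum_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)

# Class midpoints and frequencies for the ages of 50 people taking travel tours.
age_midpoints = [22, 31, 40, 49, 58, 67, 76]
age_frequencies = [3, 5, 9, 14, 11, 6, 2]

grouped_variance = grouped_sample_variance(age_midpoints, age_frequencies)
grouped_std_dev = math.sqrt(grouped_variance)   # Formula 4-11
print(round(grouped_variance, 2))  # 180.15
print(round(grouped_std_dev, 2))   # 13.42
```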

Quartiles and Deciles and Percentiles


When presenting or analyzing a data set, it is sometimes helpful to group subjects into several equal
groups. For example, to create four equal groups we need the values that split the data such that 25% of
the observations are in each group. The cut-off points are called quartiles, and there are three of them (the
middle one also being called the median). The general term for such cut-off points is quantiles; other
values likely to be encountered are deciles, which split the data into 10 parts, and percentiles (also called
centiles), which split the data into 100 parts. Values such as quartiles can also be expressed as percentiles; for
example, the lowest quartile is also the 25th percentile, and the median is the 50th percentile or the 5th decile.

Quartiles for Ungrouped data


$$\text{Position of } Q_k = \frac{k(N+1)}{4} \qquad \text{(Formula 4-12)}$$

Where: Q_k = quartile
N = population (number of observations)
k = quartile location (1, 2, or 3)
Example 1: Find the first, second, and third quartiles of the ages of 9 middle management
employees of a certain company. The ages are 53, 45, 59, 48, 54, 46, 51, 58, and 55.
Solution:
Step 1: Arrange the data in order.
45, 46, 48, 51, 53, 54, 55, 58, 59

Step 2: Locate the first, second, and third quartiles using Formula 4-12.

$$Q_1 \text{ position} = \frac{1(9+1)}{4} = 2.5 \qquad Q_2 \text{ position} = \frac{2(9+1)}{4} = 5 \qquad Q_3 \text{ position} = \frac{3(9+1)}{4} = 7.5$$

Step 3: Identify the first, second, and third quartile values in the data set.
45, 46, 48, 51, 53, 54, 55, 58, 59
The 2.5th position lies between 46 and 48, the 5th position is 53, and the 7.5th position lies between 55 and 58.

Since the 2.5th position falls between 46 and 48, and the 7.5th position falls between 55 and 58, we can determine
the first and third quartiles of the data set by getting the average of the two values.

$$Q_1 = \frac{46+48}{2} = 47 \qquad Q_3 = \frac{55+58}{2} = 56.5$$

Therefore, Q1 = 47, Q2 = 53, and Q3 = 56.5.
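
The procedure in this example can be sketched in Python. The function below is a hypothetical helper that mirrors the text's approach: it finds the position k(N+1)/4 and, when the position falls between two ranks, averages the two neighbouring values. It assumes the position lands inside the data, and other quartile conventions exist that may give slightly different answers.

```python
def quartile(values, k):
    """k-th quartile using the position k(N + 1)/4 on the sorted data."""
    data = sorted(values)
    position = k * (len(data) + 1) / 4     # 1-based position in the ordered data
    lower_rank = int(position)             # rank just at or below the position
    if position == lower_rank:
        return data[lower_rank - 1]        # position is a whole number
    # Position falls between two ranks: average the neighbouring values.
    return (data[lower_rank - 1] + data[lower_rank]) / 2

ages = [53, 45, 59, 48, 54, 46, 51, 58, 55]
print(quartile(ages, 1), quartile(ages, 2), quartile(ages, 3))  # 47.0 53 56.5
```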

Quartiles for Grouped data

$$Q_k = LB + \left[\frac{\frac{kN}{4} - cf}{f}\right](i)$$

Where: Q_k = quartile
N = population
k = quartile location
LB = lower boundary of the quartile class
f = frequency of the quartile class
cf = cumulative frequency before the quartile class
i = class interval
Example: Determine Q1, Q2, and Q3 of the frequency distribution of the ages of 50 people taking
travel tours.

Class Limits    f
18-26           3
27-35           5
36-44           9
45-53           14
54-62           11
63-71           6
72-80           2

Solution:

Step 1: Construct the cumulative frequency column in the table.

Class Limits    f     cf
18-26           3     0+3 = 3
27-35           5     0+3+5 = 8
36-44           9     0+3+5+9 = 17
45-53           14    0+3+5+9+14 = 31
54-62           11    0+3+5+9+14+11 = 42
63-71           6     0+3+5+9+14+11+6 = 48
72-80           2     0+3+5+9+14+11+6+2 = 50
Total           N = 50

Step 2: Determine the ranked value of Q1.

$$Q_1 \text{ (ranked value)} = \frac{1N}{4} = \frac{50}{4} = 12.5$$

Step 3: Identify the Q1 class by locating the 12.5th ranked value in the table. The first class whose
cumulative frequency reaches 12.5 is 36-44 (cf = 17).

Step 4: Determine the values of LB, cf, f, i, and N.

Class Limits    f     cf
18-26           3     3
27-35           5     8     (cf before the Q1 class = 8)
36-44           9     17    (Q1 class: f = 9, LB = 36 – 0.5 = 35.5)
45-53           14    31
54-62           11    42
63-71           6     48
72-80           2     50

The class interval is i = 9 and N = 50.

Step 5: Apply the formula to compute the value of the first quartile.

$$Q_1 = LB + \left[\frac{\frac{N}{4} - cf}{f}\right](i) = 35.5 + \left[\frac{\frac{50}{4} - 8}{9}\right](9) = 35.5 + \left[\frac{12.5 - 8}{9}\right](9) = 35.5 + 4.5 = 40$$

Thus, Q1 is 40. Observe that Q1 falls within the class boundaries of the Q1 class.
Step 6: Apply the same procedure to obtain the values of Q2 and Q3.
Locate the second quartile rank:

$$Q_2 \text{ (ranked value)} = \frac{2N}{4} = \frac{2(50)}{4} = 25$$

Class Limits    f     cf
18-26           3     3
27-35           5     8
36-44           9     17    (cf before the Q2 class = 17)
45-53           14    31    (Q2 class: f = 14, LB = 45 – 0.5 = 44.5)
54-62           11    42
63-71           6     48
72-80           2     50

$$Q_2 = LB + \left[\frac{\frac{2N}{4} - cf}{f}\right](i) = 44.5 + \left[\frac{\frac{2(50)}{4} - 17}{14}\right](9) = 44.5 + \left[\frac{25 - 17}{14}\right](9) = 49.64$$

Locate the third quartile rank:

$$Q_3 \text{ (ranked value)} = \frac{3N}{4} = \frac{3(50)}{4} = 37.5$$

Class Limits    f     cf
18-26           3     3
27-35           5     8
36-44           9     17
45-53           14    31    (cf before the Q3 class = 31)
54-62           11    42    (Q3 class: f = 11, LB = 54 – 0.5 = 53.5)
63-71           6     48
72-80           2     50

$$Q_3 = LB + \left[\frac{\frac{3N}{4} - cf}{f}\right](i) = 53.5 + \left[\frac{\frac{3(50)}{4} - 31}{11}\right](9) = 53.5 + \left[\frac{37.5 - 31}{11}\right](9) = 58.82$$
Deciles and percentiles for Grouped Data.

$$D_k = LB + \left[\frac{\frac{kN}{10} - cf}{f}\right](i)$$

Where: D_k = decile
N = population
k = decile location
LB = lower boundary of the decile class
f = frequency of the decile class
cf = cumulative frequency before the decile class
i = class interval

$$P_k = LB + \left[\frac{\frac{kN}{100} - cf}{f}\right](i)$$

Where: P_k = percentile
N = population
k = percentile location
LB = lower boundary of the percentile class
f = frequency of the percentile class
cf = cumulative frequency before the percentile class
i = class interval

Example: Using the example provided in the quartiles section on SJS Travel Agency, determine D7
and P22 of the frequency distribution of the ages of 50 people taking travel tours.

Class Limits    f
18-26           3
27-35           5
36-44           9
45-53           14
54-62           11
63-71           6
72-80           2

Solution:
Step 1: Construct the cumulative frequency column in the table.

Class Limits    f     cf
18-26           3     0+3 = 3
27-35           5     0+3+5 = 8
36-44           9     0+3+5+9 = 17
45-53           14    0+3+5+9+14 = 31
54-62           11    0+3+5+9+14+11 = 42
63-71           6     0+3+5+9+14+11+6 = 48
72-80           2     0+3+5+9+14+11+6+2 = 50
Total           N = 50

Step 2: Determine the ranked value of D7.

$$D_7 \text{ (ranked value)} = \frac{7N}{10} = \frac{7(50)}{10} = 35$$

Step 3: Identify the D7 class by locating the 35th ranked value in the table.
Step 4: Determine the values of LB, cf, f, i, and N.

Class Limits    f     cf
18-26           3     3
27-35           5     8
36-44           9     17
45-53           14    31    (cf before the D7 class = 31)
54-62           11    42    (D7 class: f = 11, LB = 54 – 0.5 = 53.5)
63-71           6     48
72-80           2     50
Total           N = 50

Step 5: Apply the formula to compute the value of the seventh decile.

$$D_7 = LB + \left[\frac{\frac{7N}{10} - cf}{f}\right](i) = 53.5 + \left[\frac{\frac{7(50)}{10} - 31}{11}\right](9) = 53.5 + \left[\frac{35 - 31}{11}\right](9) = 53.5 + 3.27 = 56.77$$

Thus, D7 is 56.77. Observe that D7 falls within the class boundaries of the D7 class.

To determine the value of P22, we shall be guided by the following steps.
Step 1: Determine the ranked value of P22.

$$P_{22} \text{ (ranked value)} = \frac{22N}{100} = \frac{22(50)}{100} = 11$$

Step 2: Identify the P22 class by locating the 11th ranked value in the table.
Step 3: Determine the values of LB, cf, f, i, and N.

Class Limits    f     cf
18-26           3     3
27-35           5     8     (cf before the P22 class = 8)
36-44           9     17    (P22 class: f = 9, LB = 36 – 0.5 = 35.5)
45-53           14    31
54-62           11    42
63-71           6     48
72-80           2     50
Total           N = 50

Step 4: Apply the formula.

$$P_{22} = LB + \left[\frac{\frac{22N}{100} - cf}{f}\right](i) = 35.5 + \left[\frac{\frac{22(50)}{100} - 8}{9}\right](9) = 35.5 + \left[\frac{11 - 8}{9}\right](9) = 35.5 + 3 = 38.5$$

Thus, P22 = 38.5. Observe that P22 falls within the class boundaries of the P22 class.
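
Because the decile and percentile formulas differ from the quartile formula only in the divisor (10 or 100 instead of 4), one hedged way to sketch them in Python is a single function with a `parts` argument. The function and argument names are assumptions for illustration, and integer class limits with a gap of 1 between classes are assumed.

```python
def grouped_quantile(class_limits, frequencies, k, parts):
    """k-th quantile when the data are split into `parts` equal groups.

    parts = 4 gives quartiles, 10 gives deciles, 100 gives percentiles.
    """
    n = sum(frequencies)
    rank = k * n / parts
    cf = 0                                   # cumulative frequency before the class
    for (low, high), f in zip(class_limits, frequencies):
        if cf + f >= rank:
            lower_boundary = low - 0.5
            interval = high - low + 1
            return lower_boundary + (rank - cf) / f * interval
        cf += f
    raise ValueError("rank exceeds the total frequency")

tour_classes = [(18, 26), (27, 35), (36, 44), (45, 53), (54, 62), (63, 71), (72, 80)]
tour_frequencies = [3, 5, 9, 14, 11, 6, 2]

d7 = grouped_quantile(tour_classes, tour_frequencies, 7, 10)
p22 = grouped_quantile(tour_classes, tour_frequencies, 22, 100)
print(round(d7, 2), round(p22, 2))  # 56.77 38.5
```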

Coefficient of Variation
For any two samples measured in the same units, the variance and standard deviation of each can be
compared directly. When one wants to compare the variability of data measured in different units, the
coefficient of variation can be applied. The coefficient of variation, denoted by CV, is the standard
deviation divided by the mean, with the result expressed as a percentage.

$$\text{For a sample: } CV = \frac{s}{m}(100\%)$$

$$\text{For a population: } CV = \frac{\sigma}{\mu}(100\%)$$

Where:
CV = coefficient of variation
m = sample mean
s = sample standard deviation
σ = population standard deviation
μ = population mean

Example: The average age of the engineers at VSAS Pipeline Corporation is 30 years, with a standard
deviation of 3 years; the average monthly salary of the engineers is 45,000, with a standard deviation of 3,150.
Determine the coefficients of variation of age and salary.

Solution:
Collect the information needed to compute the coefficients of variation.
s1 = 3    m1 = 30    s2 = 3,150    m2 = 45,000
Compute the coefficient of variation of age and of salary.

$$CV_1 = \frac{s_1}{m_1}(100\%) = \frac{3}{30}(100\%) = 10\% \quad \text{(age)}$$

$$CV_2 = \frac{s_2}{m_2}(100\%) = \frac{3{,}150}{45{,}000}(100\%) = 7\% \quad \text{(salary)}$$

Since the coefficient of variation is larger for age, the ages are more variable than the salaries.
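
The comparison can be sketched in Python as follows. The function name is a hypothetical helper; the inputs are the age and salary figures from the example.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return std_dev / mean * 100

cv_age = coefficient_of_variation(3, 30)
cv_salary = coefficient_of_variation(3150, 45000)
print(round(cv_age, 2), round(cv_salary, 2))  # 10.0 7.0

# The larger CV identifies the relatively more variable quantity.
print("age" if cv_age > cv_salary else "salary")  # age
```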

Interpretation and Uses of Standard Deviation


As already mentioned, the variance and standard deviation of a variable can be used to determine
the dispersion, or spread, of a variable. Specifically, the larger the variance and standard deviation, the
more the data values are spread or dispersed. The Russian mathematician P.L. Chebyshev (1821-1894)
developed a theorem that specifies the proportions of the spread in terms of the standard deviation.

Chebyshev's Theorem. For any set of observations, the proportion of the values that lie within k standard
deviations of the mean is at least

$$1 - \frac{1}{k^2}$$

where k is any constant greater than 1.

Example 1: The mean price of a laptop computer is 25,500 and the standard deviation is 2,500. Find the
price range within which at least 88.89% of the laptops will sell.
Solution:
Chebyshev's theorem states that at least 88.89% of the data values will fall within 3 standard deviations
of the mean, since 1 – 1/3² = 8/9 ≈ 0.8889.
Hence,
25,500 + 3(2,500) = 25,500 + 7,500 = 33,000
25,500 – 3(2,500) = 25,500 – 7,500 = 18,000
Therefore, at least 88.89% of all laptops sold will have a price between 18,000 and 33,000.

Example 2: A survey conducted by the Commission on Higher Education (CHED) found that the mean
amount of training allowance for department heads of colleges and universities was 25,000. The standard
deviation was 1,500. Using Chebyshev's theorem, find the minimum percentage of the data values that will
fall between 22,000 and 28,000.

Solution:
Step 1: Subtract the mean from the larger value.
28,000 – 25,000 = 3,000
Step 2: Divide the difference by the standard deviation to obtain k.

$$k = \frac{3{,}000}{1{,}500} = 2.0$$

Step 3: Use Chebyshev's theorem to determine the percentage.

$$1 - \frac{1}{k^2} = 1 - \frac{1}{2.0^2} = 1 - \frac{1}{4} = 1 - 0.25 = 0.75 \text{ or } 75\%$$

Therefore, at least 75% of the data values will fall between 22,000 and 28,000.
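
Both examples can be reproduced with a short Python sketch. The function names are hypothetical helpers introduced only for this illustration.

```python
def chebyshev_min_proportion(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2

def chebyshev_interval(mean, std_dev, k):
    """Lower and upper limits that lie k standard deviations from the mean."""
    return mean - k * std_dev, mean + k * std_dev

# Example 1: laptops, k = 3
print(round(chebyshev_min_proportion(3), 4))   # 0.8889
print(chebyshev_interval(25500, 2500, 3))      # (18000, 33000)

# Example 2: training allowance, k = (28,000 - 25,000) / 1,500
k = (28000 - 25000) / 1500
print(chebyshev_min_proportion(k))             # 0.75
```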

Kurtosis
Kurtosis is from the Greek word kyrtos or kurtos, meaning bulging. In statistics, kurtosis is a
statistical measure used to describe the distribution of observed data around the mean. It measures the
relative peakedness or flatness of a distribution (as compared to the normal distribution, which shows a
kurtosis of zero). The kurtosis of a data set is computed using the formula:

$$\text{Kurt} = \left\{\left[\frac{n(n+1)}{(n-1)(n-2)(n-3)}\right]\left[\sum_{i=1}^{n}\left(\frac{x_i-m}{s}\right)^4\right]\right\} - \frac{3(n-1)^2}{(n-2)(n-3)}$$

Where:
Kurt = kurtosis
n = sample size
x = the value of any particular observation or measurement
m = sample mean
s = sample standard deviation
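
A minimal Python sketch of a kurtosis computation is given below. It assumes the standard sample excess kurtosis formula, in which the term subtracted at the end is 3(n − 1)² / ((n − 2)(n − 3)); that final term is an assumption, since it is not fully legible in the formula as printed. The function name is hypothetical.

```python
import math

def sample_kurtosis(values):
    """Sample excess kurtosis (zero for a normal distribution); needs n >= 4."""
    n = len(values)
    if n < 4:
        raise ValueError("at least four observations are required")
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    fourth_powers = sum(((x - mean) / s) ** 4 for x in values)
    leading = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))   # assumed final term
    return leading * fourth_powers - correction

# Evenly spread values give a negative (platykurtic) result.
print(round(sample_kurtosis([1, 2, 3, 4, 5]), 4))  # -1.2
```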

Three types of kurtosis

Leptokurtic distributions are those in which values cluster heavily or pile up in the center. These are tall
distributions with narrow humps and long, high tails. Their kurtosis is positive (kurtosis > 0), which
denotes a high degree of peakedness.
Mesokurtic distributions are intermediate distributions that are neither too peaked nor too flat. The values
are moderately distributed about the center. Their kurtosis is zero (kurtosis = 0).
Platykurtic distributions are flat distributions with values more evenly distributed about the center, with
broad humps and short tails. Their kurtosis is negative (kurtosis < 0), which denotes a low degree of peakedness.

Skewness
The coefficient of skewness measures the general shape of the distribution, or its lack of symmetry.
It ranges from -3 to +3 and relates the difference between the mean and the median to the
standard deviation. The direction of the long tail of the distribution indicates the direction of the skewness.
Skewness is extremely important in finance and investing. Most sets of data, including stock prices and
asset returns, have either positive or negative skew rather than following the balanced normal distribution
(which has a skewness of zero). By knowing which way the data are skewed, one can better estimate whether a
given (or future) data point will be more or less than the mean. Most advanced economic analysis models
study data for skewness and incorporate this into their calculations. Skewness risk is the risk that a model
assumes a normal distribution of data when in fact the data are skewed to the left or right of the mean.
Types of Distribution
1. Symmetrical Distribution. The data values are evenly distributed on both sides of the mean. Also,
the distribution is unimodal, and the mean, median, and mode are similar and lie at the center of the
distribution.
2. Positively Skewed Distribution (or Right-Skewed Distribution). Most of the values in the data
fall to the left of the mean and group at the lower end of the distribution, with the tail to the right. Also, the
mean is to the right of the median, and the mode is to the left of the median.
3. Negatively Skewed Distribution (or Left-Skewed Distribution). The mass of the data values falls
to the right of the mean and groups at the upper end of the distribution, with the tail to the left. In addition,
the mean is to the left of the median, and the mode is to the right of the median.
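
The text describes the coefficient of skewness in words but does not state a formula. The sketch below assumes Pearson's median-based coefficient, Sk = 3(mean − median) / s, which matches the description (it relates the mean and median to the standard deviation and lies between -3 and +3). The function name is hypothetical, and the data are the GMS Inc. daily rates used earlier in the module.

```python
import statistics

def pearson_skewness(values):
    """Pearson's median-based coefficient of skewness (assumed formula)."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    s = statistics.stdev(values)          # sample standard deviation (n - 1)
    return 3 * (mean - median) / s

daily_rates = [550, 420, 560, 500, 700, 670, 860, 480]
sk = pearson_skewness(daily_rates)
print(round(sk, 2))  # 0.79

# A positive value means the mean lies to the right of the median (right-skewed).
print("positively skewed" if sk > 0 else "negatively skewed or symmetric")
```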

Activity

1. The following data, 435, 282, 350, 420, 340, 395, 339, and 375, are the prices of eight
books randomly selected from a university bookstore. Complete the table and find the
average deviation.

X       X – m     |X – m|
435
282
350
420
340
395
339
375
Total

Find:
n = ____________
∑x = ________________
m = _______________
∑(x – m) = ________________
∑|x – m| = ________________

2. The supervisor of a fast-food restaurant selected several receipts at random. The amounts
spent by customers were 75, 60, 65, 62, 80, 83, 89, and 78. Complete the table and find the
variance and standard deviation.

X       x – m     (x – m)²
75
60
65
62
80
83
89
78
Total

Find:
n = ____________
∑x = ________________
m = _______________
∑(x – m)² = ________________
s² = ________________
s = _________________
3. The table below gives the frequency distribution of the number of orders received
each day during the past 25 days at the office of a mail order company. Complete the
table and compute the variance and standard deviation.
Solution:

Number of orders    f     X     fX     (X – m)     f(x – m)²
7-9
10-12
13-15
16-18
19-21
22-24
Total

Find:
m = ____________
∑f(x – m)² = ________________
s² = ________________
s = _________________

4. Complete the table and the following information, and then find Q1.

Class limits    f     cf
1 – 7           1
8 – 14          4
15 – 21         6
22 – 28         5
29 – 35         2
36 – 42         4

N = ______________
LB = __________________
f = ____________
cf = _____________
i = ___________________
Q1 = _________________

5. The average cost of a sofa set is 36,500. The standard deviation is 4,500. Using
Chebyshev's theorem, find the minimum percentage of the data values that will
fall in the range of 26,600 to 46,400.

