Chapter 5 Statistics and Data
Statistics
NOTE:
A statistical measure or characteristic obtained by using all the data values from a given
population is called a parameter, while a statistical measure or characteristic obtained by
using the data values from a sample is called a statistic.
Measures of Central Tendency (Location)
1. The Arithmetic Mean. The mean is also commonly called the average. Given a
data set of $n$ values, $x_1, x_2, \ldots, x_n$, the mean is the sum of the
data values divided by the total number of data values. The mean for a
sample is denoted by $\bar{x}$, and this statistic is computed as
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n} \]
where $n$ represents the total number of values in the sample.
Example 1. In a given quiz, 5 students from an MMW class of 20 students got test grades of 41,
39, 50, 47, and 44. Find the mean of these test scores.
Example 2. The ages of the working students in two sections of an MMW class are 18, 20, 17,
17, 18, 19, 19, and 20. Find the mean.
Notations: $\mu$ (parameter, the population mean); $\bar{x}$ (statistic, the sample mean)
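The mean formula can be checked with a few lines of Python. This is a minimal sketch, not a prescribed method: the function name `arithmetic_mean` and the variable names are illustrative choices, and the data are taken from Examples 1 and 2 above.

```python
def arithmetic_mean(values):
    """Return the sum of the data values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


quiz_scores = [41, 39, 50, 47, 44]        # Example 1 (sum is 221, n is 5)
ages = [18, 20, 17, 17, 18, 19, 19, 20]   # Example 2 (sum is 148, n is 8)

print("Example 1 mean:", arithmetic_mean(quiz_scores))
print("Example 2 mean:", arithmetic_mean(ages))
```

Both data sets are samples, so each result is a statistic $\bar{x}$ rather than a parameter $\mu$.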
Example 1. The efficiency ratings of seven (7) employees of a certain company are 81,
69, 93, 76, 87, 70, and 95. Find the median.
Example 2. The weights (in kg) of a sample of six (6) female Miss Universe candidates
are 50, 49, 53, 58, 48, and 55. Find the median.
Properties of Median
♣ It is unique (for numerical data)
♣ It can be computed for ordinal, interval or ratio level data.
♣ It is not affected by extreme values since the median uses only the middle value/s.
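The two median examples can be worked through in Python. The sketch below assumes the usual definition, which is not spelled out in this section: sort the data, then take the middle value when the count is odd or the average of the two middle values when the count is even. The function name `median` is an illustrative choice.

```python
def median(values):
    """Return the middle value of the sorted data (assumed standard definition)."""
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: exactly one middle value.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


ratings = [81, 69, 93, 76, 87, 70, 95]   # Example 1, seven values
weights_kg = [50, 49, 53, 58, 48, 55]    # Example 2, six values

print("Example 1 median:", median(ratings))
print("Example 2 median:", median(weights_kg))

# Replacing the largest rating with an extreme value leaves the median unchanged.
print("With an outlier:", median([81, 69, 93, 76, 87, 70, 950]))
```

The last call illustrates the third property: only the middle of the ordered data matters, so an extreme value at either end does not move the median.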
3. Mode. The mode is the data value that occurs most frequently.
Note: A data set can have more than one mode or no mode at all.
Example. Find the mode(s) of the following sets of raw data:
a. 1 9 2 8 3 7 5 9 6
b. 8 2 9 1 5 7 6 3 4
c. 9 2 1 9 3 5 7 1 6
Properties of Mode
♣ It can be computed for any type of data, whether nominal, ordinal, interval, or
ratio level.
♣ It may not be unique, since more than one value can occur with the highest frequency, as in
Example c above.
♣ It may not exist, as in Example b above.
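The three data sets in the mode example can be checked with a short Python sketch. It rests on one assumption made for illustration: a data set is treated as having no mode when every value occurs exactly once. The function name `modes` is hypothetical.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of the most frequent values, or [] if none repeats."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        # Assumption: when no value repeats, the data set has no mode.
        return []
    return sorted(value for value, count in counts.items() if count == highest)


set_a = [1, 9, 2, 8, 3, 7, 5, 9, 6]
set_b = [8, 2, 9, 1, 5, 7, 6, 3, 4]
set_c = [9, 2, 1, 9, 3, 5, 7, 1, 6]

print("a:", modes(set_a))   # one mode
print("b:", modes(set_b))   # no mode
print("c:", modes(set_c))   # two modes
```

Returning a list rather than a single number reflects the properties above: the mode may be unique, may have several values, or may not exist.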
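4. Weighted Mean. A value called the weighted mean is often used when some data
values are more important than others.
Formula:
\[ \bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} \]
where $w_i$ is the weight assigned to the data value $x_i$.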
Component                                             Score
1st Prelim Exam                                       43/95
2nd Prelim Exam (Midterm)                             64/100
Final Exam (comprehensive)                            83/110
Classroom Performance (quizzes, assignments, etc.)    436/650
Given that the 1st and 2nd prelim exams are weighted 20% each, the final exam 30%, and
classroom performance 30%, determine whether Mary Grace will pass her Math 17
subject if the minimum required percentage is 45%.
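The Mary Grace problem can be worked as a weighted mean in Python. The sketch below assumes that each component is first converted to a percentage of its total points and that the weighted mean is the weighted sum divided by the sum of the weights. The names `weighted_mean` and `components` are illustrative.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# (component, score earned, total points, weight in percent)
components = [
    ("1st Prelim Exam", 43, 95, 20),
    ("2nd Prelim Exam (Midterm)", 64, 100, 20),
    ("Final Exam (comprehensive)", 83, 110, 30),
    ("Classroom Performance", 436, 650, 30),
]

percentages = [100 * earned / total for _, earned, total, _ in components]
weights = [weight for _, _, _, weight in components]

final_percentage = weighted_mean(percentages, weights)
passed = final_percentage >= 45   # minimum required percentage

print("Final percentage:", round(final_percentage, 2))
print("Passed:", passed)
```

Under these assumptions the final percentage comes out at about 64.6%, which is above the 45% minimum.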
Measures of Variation (Dispersion)
In the preceding section, we introduced three measures of central location:
the mean, the median, and the mode. These measures alone cannot provide
an adequate description of how the data spread or deviate from
the mean. For instance, consider a soft-drink dispensing machine that
should dispense 8 oz. of your selection into a cup. The table below
shows data for two of these machines.

Table: Soda Dispensed (Ounces)
            Machine 1    Machine 2
            9.52         8.01
            6.41         7.99
            10.07        7.95
            5.85         8.03
            8.15         8.02
Mean        8.0          8.0
The data for the two machines have an equal mean of 8 oz. However,
the quantity of soda dispensed by Machine 1 is very inconsistent: in
some cases the soda overflows the cup, and in some cases too little soda
is dispensed.
Machine 2, on the other hand, is working fine. The quantity dispensed
is very consistent, with little variation.
This example shows that average values do not reflect the spread or dispersion of data. To measure
spread or dispersion of data, we introduce statistical values such as the range and the variance.
1. Range. The range of a set of data is the difference between the largest and
smallest values in the data set. That is,
Range = largest value - smallest value.
Example. The grade-point averages of 20 college seniors selected at random from the
graduating class are as follows:
Parameter: $\sigma = \sqrt{\sigma^2}$     Statistic: $s = \sqrt{s^2}$
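To make the soda-machine comparison concrete, the sketch below computes the range, variance, and standard deviation for both machines in Python. The variance formula did not survive in this section, so the code assumes the usual sample variance $s^2$ with divisor $n - 1$, as implemented by `statistics.variance` in the Python standard library. The helper name `data_range` is illustrative.

```python
import statistics

machine_1 = [9.52, 6.41, 10.07, 5.85, 8.15]
machine_2 = [8.01, 7.99, 7.95, 8.03, 8.02]


def data_range(values):
    """Return the largest value minus the smallest value."""
    return max(values) - min(values)


for name, data in (("Machine 1", machine_1), ("Machine 2", machine_2)):
    mean = statistics.mean(data)
    variance = statistics.variance(data)   # sample variance, divisor n - 1 (assumed)
    std_dev = statistics.stdev(data)       # square root of the sample variance
    print(
        name,
        "mean:", round(mean, 2),
        "range:", round(data_range(data), 2),
        "variance:", round(variance, 4),
        "std dev:", round(std_dev, 4),
    )
```

Both machines report the same mean, but the range and variance of Machine 1 are far larger (a range of roughly 4.22 oz. against 0.08 oz.), which is exactly the inconsistency the table suggests.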
Measures of Relative Position (Fractiles)
In addition to measures of central tendency and measures of variation, there are also
measures of position, which locate a value at the center or at any other point in the
distribution of the data. These measures, often referred to as quantiles or fractiles, are
values below which a specific fraction or percentage of the observations in a given set must
fall. These measures include percentiles, deciles, quartiles, and $z$-scores.
1. Percentiles. Percentiles are values that divide a set of observations (arranged
in increasing order) into 100 equal parts.
We use $P_k$ ($k = 1, 2, 3, \ldots, 99$) to denote the $k$th percentile, the value such
that $k\%$ of the observations fall below it.
2. Deciles. Deciles are values that divide the set of observations into 10 equal parts.
➢ The $k$th decile is denoted by $D_k$ ($k = 1, 2, \ldots, 9$), the value such that
$10 \cdot k\%$ of the observations fall below it.
3. Quartiles. Quartiles are values that divide the set of observations into 4
equal parts.
➢ The $k$th quartile is denoted by $Q_k$ ($k = 1, 2, 3$), the value such that
$25 \cdot k\%$ of the observations fall below it.
Steps in computing measures of NCL:
Step 1. Arrange the data in increasing order.
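Step 2. Find the location $L$ of the $k$th fractile by computing
$L = \frac{k}{100} \cdot n$ for the percentile $P_k$, $L = \frac{k}{10} \cdot n$ for the decile $D_k$,
or $L = \frac{k}{4} \cdot n$ for the quartile $Q_k$, where $n$ is the number of observations.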
Step 3.
If $L$ is an integer, then the desired value is the average of the $L$th and
$(L + 1)$th observations.
If $L$ is not an integer, round $L$ up to the next integer. The desired value is the
observation at the rounded-up position.
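The three steps can be sketched in Python as follows. One assumption is labeled explicitly: the location is taken to be $L = (k / \text{parts}) \cdot n$, with parts equal to 100 for percentiles, 10 for deciles, and 4 for quartiles, which is the location formula that normally accompanies the Step 3 rule. The function name `fractile` and its parameters are hypothetical. The data are the movie counts from the first example that follows.

```python
def fractile(values, k, parts):
    """Return the k-th fractile when the data are split into `parts` equal parts.

    Assumed location formula: L = (k / parts) * n, with 1-based positions.
    """
    if not 1 <= k < parts:
        raise ValueError("k must satisfy 1 <= k < parts")
    if not values:
        raise ValueError("the data set is empty")
    ordered = sorted(values)                   # Step 1: increasing order
    n = len(ordered)
    whole, remainder = divmod(k * n, parts)    # Step 2: L = whole + remainder / parts
    if remainder == 0:
        # Step 3, L is an integer: average the L-th and (L + 1)-th observations.
        return (ordered[whole - 1] + ordered[whole]) / 2
    # Step 3, L is not an integer: round up and take that observation.
    return ordered[whole]                      # 0-based index of position whole + 1


movies = [3, 0, 3, 1, 6, 5, 7, 5, 8, 8, 10, 11]

print("P48:", fractile(movies, 48, 100))
print("D8:", fractile(movies, 8, 10))
print("Q3:", fractile(movies, 3, 4))
```

Integer arithmetic with `divmod` is used instead of floating-point division so that the test for whether $L$ is an integer is exact.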
Examples:
1. The numbers of movies attended last month by a random sample of 12 students
are recorded as follows: 3, 0, 3, 1, 6, 5, 7, 5, 8, 8, 10, and 11.
Find the following: $P_{48}$, $D_8$, $Q_3$.
2. The following table lists the calories per 100 ml of 10 popular sodas. Find the
following:
a. $P_{48}$
b. $P_{70}$
EXERCISES
For numbers 1-2, find the (a) Mean, (b) Median and (c) Mode.
1. The times (in minutes) that each student spent working on the assigned problems are
given below:
22 25 19 33 31 22 42 25 17
2. The Prelim scores of MMW students are given below:
79 80 49 67 86 91 36 33 77 50 12.
3. Suppose a student made an average score of 65% in attendance, 60% in the
quizzes, 47% in the first prelim, 48% in the second prelim, and 54% in the final exam.
What is the final average score of the student if attendance weighs 5% of the
total grade, quizzes 25%, the first prelim 20%, the second prelim 20%, and the final exam
30%? Did the student pass the course if the passing cut-off score is 60%?