Module 3: Data Analysis and Interpretation

This document discusses measures of central tendency and variation that are used to analyze and interpret data. It introduces concepts like mean, median, mode, range, variance, and standard deviation. The document also discusses how these measures can provide insights when assessing learning outcomes and how different measures are suited for different types of data, whether nominal, ordinal, interval, discrete, or continuous.
Module Overview

This module presents the discussions on the principles, concepts, and approaches in
central tendency and variation measures. It also introduces the principles employed in the
normal curve and the methods and interpretations of the skewness and kurtosis of the curves.

Motivation Question

How do the characteristics and significance of central tendency, variation, normal curve,
skewness, and kurtosis provide educational and philosophical insights into its application in
assessing learning outcomes?

Module Pretest
Instructions: Multiple Choice. Select the option that best answers or completes each
statement. Write only the letter that corresponds to the chosen option.
1. The median of a, y, x, c, f, g, d is
a. y b. f
c. x d. y

2. The mean of ten numbers is 58. If one of the numbers is 40, what is the mean of the
other nine?
a. 40 b. 60
c. 50 d. 80

3. Which measure gives the distance between the highest and the lowest values?
a. range b. variance
c. standard deviation d. mean deviation

4. Which measure of central tendency is most appropriate if your data is the marital
status?
a. Mean b. Median
c. Mode d. Percentile
5. What is the median of the discrete data 4, 6, 8, 10, 10, 15, 22, 25?
a. 9 b. 10.5
c. 10 d. 11

6. Given 100 measures arranged in a line, the 50th measure is equal to the
a. median b. P45
c. 3rd quarter d. mean

7. Which curve suggests a homogeneous cluster of students’ scores showing a very close
competition?

a. Platykurtic b. Leptokurtic
c. Mesokurtic d. Normal curve

8. When the items in a test are easy for most of the students, the curve is skewed
a. negatively b. to the right
c. positively d. no skewness

9. The result of a test in science having a mean of 89 and a median of 75 suggests a
skewness that is
a. negative b. zero
c. positive d. undefined

10. Which denotes a median?
a. The 3rd quarter
b. The square root of the variance
c. The difference between the 1st and 3rd quartile
d. The 50th percentile

Lesson 3.1: Measures of Central Tendency and Variation

Lesson Summary

Introduced in this lesson are the concepts and statistical principles employed in central
tendency and variation measures. It aims to scale up the knowledge and skills of education
students in the computation and the application of these measures to interpret and evaluate
learning outcomes.

Learning Outcomes
After completing the lesson, the students can:
1. Discuss the characteristics, uses, and limitations of each measure of central tendency.
2. Compute each measure of central tendency and interpret the results.
3. Determine the range, variance, and standard deviation of sets of scores.
4. Interpret the descriptive numerical measures of variation of the given set of data.

Motivation Question

How can we use our knowledge and skills in central location measures and variation in
interpreting and evaluating learning outcomes?

Discussion
The Measures of Central Tendency

The measures of central tendency or concentration location, otherwise known as
measures of averages, are not new to you. You have studied these in statistics and mathematics
in junior and senior high school. Can you still recall the mean, median, and mode? Assessment
of learning involves variables and data, and the kinds of variables and data are important
considerations in using the measures of central tendency.

The mean, median, and mode tend to lie centrally within a set of data. Thus, they are
called the measures of central tendency. The frequency distribution displays a normal curve or
symmetrical curve when the mean, median and mode are equal. However, when the mean,
median, and mode are not the same or equal, we have an asymmetrical shape that is either
skewed to the right or left. The mode corresponds to the maximum point or points on the curve.
The median corresponds to the vertical line, which divides the histogram into parts having equal
areas. In general, the mean surpasses the median in positively skewed distribution, whereas the
median exceeds the mean in a negatively skewed distribution.

Variables are characteristics of a certain object and can be nominal, ordinal, ratio, or
interval. Moreover, the true data measure can either be qualitative or quantitative. Further,
quantitative data can either be discrete or continuous. A contingency table taken from
Buenaflor (2012) is presented as a guide to determine the applicability in using average
measures.

Table 3-1 Contingency Table on Applicability of Averages

Measure        Variable
               Nominal           Interval                  Ordinal
Discrete       Mode              Mean, Median, and Mode    Mean, Median, and Mode
Continuous     (not possible)    Mean, Median, and Mode    Mean, Median, and Mode

The table shows that the mode is the only average applicable to nominal, discrete data.
The mean, median, and mode all apply to interval data, whether discrete or continuous, and
likewise to ordinal data, whether discrete or continuous. The table also indicates that no
average is possible for nominal, continuous data; treating such an attribute on its own in
assessing learning is without meaning and, thus, illogical.

The Mean

The mean is the most stable of all the measures of central tendency. This measure
has the characteristics of being the center of all the observations concerning their values or
magnitude. In using this measure, the teacher or the assessor must examine the kind of
variable and the measure at hand. The mean is appropriately applicable to interval measures,
whether discrete or continuous, but not to nominal variables. How the mean is computed
depends on how the data are organized (ungrouped or grouped), and each arrangement has its
own formula. We will focus our study on the measures of central tendency using ungrouped
data.

The ungrouped data may use either the arithmetic mean or the weighted mean. The
formula for the arithmetic mean is $\bar{X} = \sum X / N$. On the other hand, the formula for the
weighted mean is $\bar{X} = \sum fx / \sum f$.

Example 3-1: Suppose your students’ scores in the first examination are the following: 99, 78,
85, 77, 86, 84, 80, 81, 82, 84, 79, 66, 88, 75. Find the mean.

Solution: $\bar{X} = \sum X / N$
= (99 + 78 + 85 + 77 + 86 + 84 + 80 + 81 + 82 + 84 + 79 + 66 + 88 + 75)/14
= 1,144/14 = 81.71, or 82 when reported as a discrete score.
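
The computation in Example 3-1 can be checked with a short Python sketch. The function name `arithmetic_mean` is only an illustrative choice, not something from the lesson; the scores are the ones given in the example.

```python
# Scores from Example 3-1.
scores = [99, 78, 85, 77, 86, 84, 80, 81, 82, 84, 79, 66, 88, 75]


def arithmetic_mean(values):
    """Return the sum of the values divided by how many there are (sum X / N)."""
    return sum(values) / len(values)


mean_score = arithmetic_mean(scores)
print(sum(scores))           # 1144
print(round(mean_score, 2))  # 81.71
print(round(mean_score))     # 82, the mean reported as a discrete score
```

Rounding is done only at the end, so the same function can serve both the discrete and the continuous interpretation of the result.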

Example 3-2: Find the mean of the number of eggs sold by 14 students, which are: 20, 25, 18, 20,
20, 15, 10, 14, 25, 18, 18, 19, 19, 20.

Solution:
X       f       fx
25      2       50
20      4       80
19      2       38
18      3       54
15      1       15
14      1       14
10      1       10
        ∑f = 14 ∑fx = 261

$\bar{X} = \sum fx / \sum f = 261/14 = 18.64$, or 19 when rounded to a whole number of eggs.
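
As a sketch of how the frequency table in Example 3-2 can be built and used, the Python lines below tally each value with `collections.Counter` and then apply the formula ∑fx/∑f. The variable names are assumptions made for illustration.

```python
from collections import Counter

# Number of eggs sold by the 14 students in Example 3-2.
eggs_sold = [20, 25, 18, 20, 20, 15, 10, 14, 25, 18, 18, 19, 19, 20]

# Frequency table: each distinct value X mapped to its frequency f.
frequency = Counter(eggs_sold)

sum_f = sum(frequency.values())                    # total of the f column
sum_fx = sum(x * f for x, f in frequency.items())  # total of the fx column

mean_eggs = sum_fx / sum_f
print(sum_f, sum_fx)        # 14 261
print(round(mean_eggs, 2))  # 18.64
print(round(mean_eggs))     # 19, since eggs are counted in whole units
```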

Example 3-3: Suppose in the first semester Mr. Aguja's grades in his subjects, with their
corresponding credit units, are as follows. Find his weighted mean grade.

Subject      Credit Units (W)    Grade (X)
Math 17      4                   2.0
PrEd 146     3                   2.7
Math 1       3                   1.6
Phys 134     4                   2.0
Chem 125     5                   2.5

Solution:
$$\bar{X} = \frac{\sum XW}{\sum W} = \frac{2.0(4) + 2.7(3) + 1.6(3) + 2.0(4) + 2.5(5)}{4 + 3 + 3 + 4 + 5} = \frac{8 + 8.1 + 4.8 + 8 + 12.5}{19} = \frac{41.4}{19} = 2.18 \text{ or } 2.2$$
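
The weighted mean in Example 3-3 can also be sketched in Python. The helper name `weighted_mean` and the dictionary layout are illustrative assumptions; the credit units and grades are those listed in the example.

```python
# (credit units W, grade X) for each subject in Example 3-3.
subjects = {
    "Math 17": (4, 2.0),
    "PrEd 146": (3, 2.7),
    "Math 1": (3, 1.6),
    "Phys 134": (4, 2.0),
    "Chem 125": (5, 2.5),
}


def weighted_mean(pairs):
    """Return sum(X * W) / sum(W) for an iterable of (W, X) pairs."""
    pairs = list(pairs)
    total_weight = sum(w for w, _ in pairs)
    weighted_sum = sum(w * x for w, x in pairs)
    return weighted_sum / total_weight


grade = weighted_mean(subjects.values())
print(round(grade, 2))  # 2.18
print(round(grade, 1))  # 2.2
```

Each grade counts in proportion to its credit units, which is why the 5-unit subject pulls the result toward 2.5 more strongly than a 3-unit subject would.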

The decision whether to round off the answer to a whole number depends on the kind of
variable being measured. Example 3-2 is on the number of eggs, which is a discrete
variable. Hence, the correct answer is 19 instead of 18.64 because there is no such thing as
0.64 of an egg. If the measure stood for a continuous quantity, the answer would be 18.64.
In Example 3-1, the students' scores are expressed as discrete data. Hence, the mean
should be discrete, which is 82 instead of 81.71. In Example 3-3, the students' grades
are expressed as continuous data; hence, the correct answer is 2.2.

The mean has the following properties:

1. Existence of the mean. It means that you can always compute the mean of any set
of numerical data.
2. The uniqueness of the mean. That signifies that there is one and only one mean for a set
of numerical data.
3. The means of several sets of data can be combined to obtain the mean of all the
data.
4. In getting the mean, every value in the set of data is considered.
5. The mean is the most preferred measure of central tendency because it describes the
balance point of any distribution and uses all values in the data set.
6. The algebraic sum of the deviations of a set of numbers from the arithmetic mean is
zero.
For example, the mean of the numbers 6, 10, 14, 18 is 12. The deviations are as follows:
6 – 12 = -6; 10 – 12 = -2; 14 – 12 = 2; 18 – 12 = 6. Therefore, the sum of the deviations is
-6 – 2 + 2 + 6 = 0.
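
Property 6 can be verified numerically with a few lines of Python, using the same four numbers as the example above. This is only a check on one data set, not a proof of the property.

```python
numbers = [6, 10, 14, 18]

mean_value = sum(numbers) / len(numbers)        # 12.0
deviations = [x - mean_value for x in numbers]  # [-6.0, -2.0, 2.0, 6.0]

print(deviations)
print(sum(deviations))  # 0.0
```

With data whose mean is not exactly representable in floating point, the sum may come out as a very small nonzero number, so a tolerance check such as `math.isclose(total, 0.0, abs_tol=1e-9)` is a safer test than strict equality.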
The Median
The median ($\tilde{X}$) of a data set is the middle score obtained after arranging the
data from the lowest value to the highest value (ascending order), assuming that the number of
cases or observations is odd. If the number of cases is even, the median is the average
of the two middlemost scores. In interval data, the median does not require the measure or
weight of each element. What it requires is the ordinal or normal sequencing of the elements
from highest to lowest. For nominal variables, it requires a sequential pattern, for example the
days of the week, the months, or the letters of the alphabet.
The median is considered the appropriate measure for nominal data. It only considers
the elements as having equal values and is equidistant from highest to lowest extremes after
their proper ordinal or normal sequential arrangement. It is the central point of a line of all
measures in an ordinal arrangement. This value corresponds to a central point between the
upper 50% and the lower 50% of all measures. This central point is calculated as central point
(cp)= (n+1)/2.
To find the median of ungrouped data, we first arrange the highest to lowest values or
vice versa. Then we pick the middle value when the number of values (N) is odd. However, if N
is even, we add the middle scores and divide the sum by two.
Example 3-4: Suppose the scores of 15 students in a ten-item test are as follows:
6 9 7 10 5 7 4 8 6 3 2 6 9 1 6
Solution:
First, arrange the scores from the highest value to the lowest value; hence we have:

10 9 9 8 7 7 6 6 6 6 5 4 3 2 1

Since there are 15 values (an odd number), the middle value falls at the (n+1)/2 = (15+1)/2 =
8th position; therefore, the second 6 is the median.

Example 3-5: Suppose instead of 15 students, 16 took the test and the 16th student got a
score of one (1). Hence, we have:

10 9 9 8 7 7 6 6 6 6 5 4 3 2 1 1

Since there are 16 values (an even number), the median is the number falling at the
(n+1)/2 position, which is (16+1)/2 = 8.5. Thus, the median is between the second and the third
6s, as it is the number falling at the 8.5 location in the series. The median, being discrete, is 6.
However, if you consider the value as continuous, the median is 6.5 (lower limit of 6 which is 5.5 +1).
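
The position rule used in Examples 3-4 and 3-5 can be sketched in Python as follows. The function names `central_point` and `discrete_median` are illustrative assumptions, and the sketch covers only the discrete reading of the median (the middle value, or the average of the two middle values).

```python
def central_point(n):
    """Return the (n + 1) / 2 position of the median in an ordered series."""
    return (n + 1) / 2


def discrete_median(scores):
    """Return the middle value, or the average of the two middle values."""
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


scores_15 = [6, 9, 7, 10, 5, 7, 4, 8, 6, 3, 2, 6, 9, 1, 6]  # Example 3-4
scores_16 = scores_15 + [1]                                 # Example 3-5

print(central_point(len(scores_15)), discrete_median(scores_15))  # 8.0 6
print(central_point(len(scores_16)), discrete_median(scores_16))  # 8.5 6.0
```

Sorting in ascending or descending order gives the same median, because the middle position is the same when counted from either end.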

Rules for consideration in computing median in ungrouped data:


1. If the number of data in each set is odd and not nominal, the median shall be treated as
continuous.
2. If the data in each set is odd and nominal, it shall be treated as discrete.
3. If the number of data in a given set is even and nominal, no median is possible.
4. If the number of data in a given set is even and not nominal, the median is treated as
continuous.
5. If the number of data in each set is odd or even, and discrete, the median remains
discrete.
6. If the number of data in each set is odd or even continuous, the median remains
continuous.

Example 3-6: Find the median of the following data:

Data                          Category       Median
1. a, d, s, m, p              nominal        $\tilde{X}$ = m
2. 3, 5, 8, 9, 12, 15, 20     discrete       $\tilde{X}$ = 9
                              continuous     $\tilde{X}$ = 9.5
3. 2, 3, 7, 7, 7, 11          discrete       $\tilde{X}$ = 7
                              continuous     $\tilde{X}$ = 6.83
4. 2, 7, 30, 4, 25            not nominal    $\tilde{X}$ = 7
5. D and J in the alphabet    nominal        $\tilde{X}$ = G

Notice that a problem may arise in items 2 and 3 as to how the median is computed.
First, you determine if the data is discrete or continuous. For example, in item number 2, if the
measure is discrete, then the median is located at (n+1)/2 or (7+1)/2 = 4th number in the series;
hence it is 9. But if the data is continuous, the median is 9.5, which is the lower limit of 9 (8.5)
+1.

Similarly, in problem number 3, if the measure is discrete, the median is found at the
(n+1)/2 = (6+1)/2 =3.5 location. That is, the median is found between the first two of the
number 7. Thus, being discrete, the median is 7.

If the value is continuous, the median is computed by getting the upper and lower real
limits falling at the middlemost series within the three 7s, which stands 1/3 of the three 7s. The
lower and upper limits of these three similar numbers are 6.5 and 7.5, respectively. The 1/3 or
0.33 is to be added to the lower limit of 6.5 using interpolation, where the result is the median of
6.83. Another way of solving is by using the upper limit of 7.5, where the median is between the
2nd and the 3rd value of 7, counted from the upper measures. The 2/3 or 0.67 is subtracted
from the upper limit of 7.5 applying interpolation, resulting in 6.83; thus, the median is 6.83.
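
The interpolation described for item 3 can be sketched in Python. The function `interpolated_median` and its `width` parameter are assumptions made for illustration: the sketch takes each score as covering one unit (for example, 7 spans 6.5 to 7.5) and spreads tied scores evenly across that interval. It covers only the tied-score case worked out in item 3.

```python
def interpolated_median(scores, width=1.0):
    """Interpolate the median inside the real limits of the middle score."""
    ordered = sorted(scores)
    n = len(ordered)
    middle_value = ordered[(n - 1) // 2]   # score that contains the median
    lower_limit = middle_value - width / 2  # e.g. 6.5 for a score of 7
    below = sum(1 for s in ordered if s < middle_value)
    tied = ordered.count(middle_value)
    # Move into the interval by the fraction of tied scores still needed
    # to reach half of the observations.
    return lower_limit + (n / 2 - below) / tied * width


item_3 = [2, 3, 7, 7, 7, 11]
print(round(interpolated_median(item_3), 2))  # 6.83
```

Here half of the six observations is 3, two scores lie below 6.5, and one more is needed out of the three 7s, so the median sits one third of the way into the interval.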

The Mode
  The mode is the measure of central tendency that does not need any calculation. You
only have to pick the value in the set of data that appears most frequently. If a set of data has
one mode, we call the set unimodal. If there are two modes, we describe the set as bimodal.
When there are three modes, we label it as trimodal. In general, if there is more than one mode,
the set is called multimodal.

Example 3-7: 3, 4, 5, 7, 7, 10, 11, 7, 11, 4, 7, 11, 3, 11


In the above example, the modes are 7 and 11, hence the set is bimodal.
Example 3-8: In a survey where the males outnumber the females, we can say that "male" is the
modal sex.
Example 3-9: The modal blood type is Type O because it is the most common blood type among
people.
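
Picking out the mode or modes can be sketched in Python with `collections.Counter`. The helper name `modes` is an illustrative assumption; it returns every value that shares the highest frequency, so it handles unimodal and multimodal sets alike.

```python
from collections import Counter


def modes(values):
    """Return all values that occur with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


data = [3, 4, 5, 7, 7, 10, 11, 7, 11, 4, 7, 11, 3, 11]  # Example 3-7
print(modes(data))       # [7, 11]
print(len(modes(data)))  # 2, so the set is bimodal
```

The same function works for nominal data such as `["male", "male", "female"]`, since it only counts occurrences and never does arithmetic on the values.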

Measures of Variation

Sets of data vary to a certain extent. Even when two sets have the same mean, the
spread of their scores may still differ.

Example 3-10: The scores of two classes in a test in a Prof. Ed. subject.

Class A    25  28  35  33  44  35  25  26  29  28  45
Class B    28  29  35  55  40  28  20  26  23  29  40

The scores of the two classes in the Prof. Ed. subject both have a mean of 32.09.
However, if you look closely, the scores in Class B seem to be more dispersed than the scores in
Class A.

If the variability is big enough, we can infer two things:


1. The test is discriminatory, which means that only the bright students can obtain the
correct answer, and there are not enough easy questions for poor students.
2. Students in the class are not alike in their capabilities, which means that some are fast
learners, and some are slow learners.
 
Some of the measures of variability or dispersion are the range, variance, and standard
deviation. Each of these is discussed below.

1. The Range

The range (R) is the simplest and easiest measure of dispersion. It is computed as the
difference between the highest and the lowest values of the observations. The bigger the
value of the range, the wider the gap between the extreme values and the more varied the
numbers are. A small value of the range implies a more uniform set of data. However, the range
tells nothing about the values between the highest and the lowest observations; hence, it is
considered the least satisfactory measure of dispersion.

To find the range of ungrouped data using the data in Example 3-10 on the scores in Prof.
Ed. of the students in Class A, we have:

Range = highest value - lowest value
= 45 - 25
= 20

On the other hand, the range of the scores obtained by Class B is

Range = 55 - 20 = 35
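
The two ranges, together with the equal means noted in Example 3-10, can be checked with the Python sketch below. It assumes that the first score of each class reads 25 and 28 respectively, which is the reading that reproduces the stated mean of 32.09; the function name `value_range` is illustrative.

```python
# Scores from Example 3-10 (first score of each class assumed to be 25 and 28).
class_a = [25, 28, 35, 33, 44, 35, 25, 26, 29, 28, 45]
class_b = [28, 29, 35, 55, 40, 28, 20, 26, 23, 29, 40]


def value_range(scores):
    """Return the highest value minus the lowest value."""
    return max(scores) - min(scores)


for name, scores in (("Class A", class_a), ("Class B", class_b)):
    mean = sum(scores) / len(scores)
    print(name, round(mean, 2), value_range(scores))
# Class A 32.09 20
# Class B 32.09 35
```

Both classes share the same mean, yet the range of Class B is larger, which matches the observation that its scores are more dispersed.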
2. Variance
The variance of a set of data, denoted by $\sigma^2$, is the average of the squared deviations
of the observations from their arithmetic mean. The deviation is defined by:
$d = x - \bar{x}$, where $x$ = an individual score and $\bar{x}$ = the mean of the scores.

Thus, the variance denoted by $\sigma^2$ is defined by the formula:

$$\sigma^2 = \frac{\sum (x - \bar{x})^2}{n}$$

However, when the number of values is not too large, the following formula is usually
preferred:

$$\sigma^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$
If the observations or scores are quite far from the mean, the variance is large, and one can
say that there is more variability in the data set. If all observations or scores are the same,
the variance is zero, which means that there is no variability at all in the data set. On the other
hand, if the scores are not all equal but are very close to the mean, the variance has a small
value, indicating less spread or variability in the data set.
Example 3-11: The scores of the students in Algebra are 33, 23, 40, 44, 15, and 25. Compute for
the variance.
Solution:
X       $X - \bar{X}$    $(X - \bar{X})^2$
33      3                9
23      -7               49
40      10               100
44      14               196
15      -15              225
25      -5               25
$\bar{X} = 30$           $\sum (X - \bar{X})^2 = 604$
n = 6
Hence,

$$\sigma^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{604}{6 - 1} = 120.8$$
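
The steps of Example 3-11 can be sketched in Python. The function name `variance` and its `sample` flag are illustrative assumptions: with `sample=True` the divisor is n - 1, as in the worked solution, and with `sample=False` the divisor is n, as in the first formula.

```python
algebra_scores = [33, 23, 40, 44, 15, 25]  # Example 3-11


def variance(scores, sample=True):
    """Average the squared deviations from the mean, dividing by n - 1 or n."""
    mean = sum(scores) / len(scores)
    squared_deviations = [(x - mean) ** 2 for x in scores]
    divisor = len(scores) - 1 if sample else len(scores)
    return sum(squared_deviations) / divisor


print(variance(algebra_scores))                          # 120.8
print(round(variance(algebra_scores, sample=False), 2))  # 100.67
```

Dividing by n - 1 always gives a slightly larger value than dividing by n, and the difference shrinks as the number of scores grows.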

Example 3-12: Susan and Lita obtained the following scores in the various quizzes in Statistics.
Compute the variances of their scores.

Susan 50 45 60 50 75
Lita 60 55 56 49 60

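Solution: First, let us compute the means:

Susan: $\bar{x} = \frac{50 + 45 + 60 + 50 + 75}{5} = 56$
Lita: $\bar{x} = \frac{60 + 55 + 56 + 49 + 60}{5} = 56$

Then, let us compute the variances of Susan and Lita:

Susan: $\sigma^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{(50 - 56)^2 + (45 - 56)^2 + (60 - 56)^2 + (50 - 56)^2 + (75 - 56)^2}{5 - 1} = \frac{36 + 121 + 16 + 36 + 361}{4} = \frac{570}{4} = 142.5$

Lita: $\sigma^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{(60 - 56)^2 + (55 - 56)^2 + (56 - 56)^2 + (49 - 56)^2 + (60 - 56)^2}{5 - 1} = \frac{16 + 1 + 0 + 49 + 16}{4} = \frac{82}{4} = 20.5$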
3. The Standard Deviation
The most commonly used measure of variation is the standard deviation (sd). The standard
deviation tells how closely the values of a data set are clustered around the mean.
In general, a lower value of the standard deviation for a set of data indicates that the
spread of the observations around the mean is relatively small. On the other hand, a
large value of the standard deviation indicates that the data set's values are scattered over a
relatively wider range around the mean. Moreover, unlike the range, the standard deviation
involves all observations in the distribution. Hence, it is considered the most accurate measure
of dispersion.
The standard deviation, denoted by sd, is sometimes called the root mean square deviation
because it is obtained by taking the positive square root of the variance calculated for
population data. When the number of observations is small, the standard deviation is obtained
using the formula:

$$sd = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}$$

where $\bar{x}$ = mean of the data
$x_i$ = individual observation
$n$ = total number of observations
However, in actual practice, when the sample size is less than 50, the denominator
used is n - 1 instead of n.
Example 3-13: Let us compute the standard deviation of the scores obtained by six students in
Algebra (Please refer to Example 3-11).
Solution:

X       $X - \bar{X}$    $(X - \bar{X})^2$
33      3                9
23      -7               49
40      10               100
44      14               196
15      -15              225
25      -5               25
$\bar{X} = 30$           $\sum (X - \bar{X})^2 = 604$

$$sd = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{604}{6 - 1}} = \sqrt{120.8} = 10.99$$

Example 3-14: Let us consider the scores obtained by Susan and Lita in the
various quizzes in Statistics (please refer to Example 3-12). Let us also
determine which of the two is more consistent in her performance.

Susan 50 45 60 50 75
Lita 60 55 56 49 60

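Solution:
a) Susan:
$\bar{x} = \frac{50 + 45 + 60 + 50 + 75}{5} = 56$

$$sd = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{36 + 121 + 16 + 36 + 361}{4}} = \sqrt{142.5} = 11.94$$

b) Lita:
$\bar{x} = \frac{60 + 55 + 56 + 49 + 60}{5} = 56$

$$sd = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{16 + 1 + 0 + 49 + 16}{4}} = \sqrt{20.5} = 4.53$$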

Both Susan and Lita have the same mean scores of 56. To find out who is more
consistent in her performance, we computed the standard deviation. A standard deviation of
11.94 units denotes that most of the scores are found within 11.94 units from each side of the
mean. Similarly, a standard deviation of 4.53 means most of the scores are located 4.53 units
from each side of the mean. Since Lita’s scores have a lower standard deviation compared to
Susan’s scores, it means that Lita’s scores are closer to the mean. Therefore, it can be
concluded that Lita’s performance is more consistent than Susan’s performance.
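
The comparison in Example 3-14 can be reproduced with the Python sketch below. The function name `sample_sd` is an illustrative assumption; it uses the n - 1 denominator, as in the worked solution.

```python
import math

susan = [50, 45, 60, 50, 75]
lita = [60, 55, 56, 49, 60]


def sample_sd(scores):
    """Return the square root of the variance computed with n - 1."""
    mean = sum(scores) / len(scores)
    total = sum((x - mean) ** 2 for x in scores)
    return math.sqrt(total / (len(scores) - 1))


results = {"Susan": sample_sd(susan), "Lita": sample_sd(lita)}
print(round(results["Susan"], 2))  # 11.94
print(round(results["Lita"], 2))   # 4.53

# The student with the smaller standard deviation is the more consistent one.
print(min(results, key=results.get))  # Lita
```

Python's standard library function `statistics.stdev` also uses the n - 1 denominator and can be used as a cross-check on these values.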

Learning Tasks/Activities
Activity 1. Be sure to allocate time to read and comprehend the contents of the lesson. Once
you have completed your readings, write reflection notes that summarize
the significant learnings you gained, together with your insights, reflections, and
views. The reflection notes should be submitted at the end of the lesson.

Reflection Notes

I learned that . . .
Activity 2: A student's final grades in ComSci 12, Mathematics, Statistics, Prof. Ed. 11, English,
Chemistry, Physics, Biology, and Earth Science are 90, 83, 85, 88, 85, 81,
83, 80, and 85, respectively. The credit units for these courses are 4, 3, 3, 3, 3, 4, 5, 4,
and 3, respectively.
a. Compute the weighted mean grade of the student.
b. If all the courses have the same credit units of 3, what is the student's mean?
c. What is the modal score? Justify your answer.
d. What is the median score?
e. If you draw a curve of the distribution of the scores, is the curve negatively
skewed, positively skewed, or symmetrical? What does it mean?
f. What is the range of the data?
g. Compute the standard deviation of the scores. Interpret the results.

Assessment
Direction: Answer/Do as directed.
A. Multiple Choice. Select the best answer. Write only the letter that corresponds to the
answer that you have chosen.
1. What is the appropriate measure of central tendency to use when you refer to the
majority frequency of occupants categorized as male or female?
a. mean b. median c. mode
2. Which frequency distribution results in a curve that is skewed to the right?
a. $\bar{X} = \tilde{X}$ b. $\bar{X} > \tilde{X}$ c. $\bar{X} = \tilde{X}$
3. Which measure is applicable to letters in the alphabet measured as nominal and
discrete?
a. mean b. median c. mode
4. A mean of 34 and a standard deviation of 6.2 for the scores in a test denote that
a. Most of the scores are found within 6.2 units from each side of the score of 34.
b. Most of the scores are found within 6.2 units below the score of 34.
c. Most of the scores are found within 6.2 units above the score of 34.
5. John and Peter obtained the same mean scores for their midterm performance in
the different subjects. If John's scores have a standard deviation of 4.2 and Peter's
standard deviation is 7.2, who is more consistent in his performance?
a. John b. Peter c. Both
6. P50 of the scores 4, 7, 8, 9, 12 is in what central location?
a. mean b. median c. mode
7. Tina obtained a percentile rank of 92% in the LET. What does the percentile rank
of 92% mean?
a. 92% of all the examinees have scores below Tina’s score.
b. 92% of all the examinees have scores above Tina’s score.
c. 8% of all the examinees have scores below Tina’s scores.
8. If the variability of the scores is "big enough," it indicates that
a. Some of the students are fast learners and some are slow learners.
b. Most of the students are slow learners.
c. Most of the students are fast learners.
9. Which is a median of the nominal data from c to m?
a. g b. h c. k
10. In a yes or no response to an issue asked, yes occurred more frequently than no;
hence, we say that yes is the ____________ score
a. Mean b. Median c. Modal

B. Modified True or False. Write T if the statement is true and change the underlined word
if the statement is false.
1. A student with a percentile rank of 75% demonstrates that he stands at a point below
25% and above 75% of the 100% score distribution.
2. If the mean is greater than the median, the distribution is negatively skewed.
3. The median corresponds to the 4th decile.
4. The deviations of the numbers 21, 2, -4, 5, and -15 to 7 is -22.
5. If the arithmetic mean is 12 and the number of observations is 20, then the sum of all
values is 240.
6. A value of the range shows the number of values between the highest and lowest
scores.
7. A small value of the variance indicates scores that are very near to the mean.
8. The median of the continuous scores 4, 8, 10, 14, 20, 21, 22 is 14.5.
9. In a symmetrical curve, the measures of central tendency are equal.
10. There is no median in the nominal data of a, c, f and g.

C. Briefly but substantially answer the following problems in your own words.
1. What do you understand about the terms "median" and "mode"?
2. How many deciles and quartiles are there in a median? Justify your answer.
3. What do you understand about the term “variation”?
