MMW Finals Reviewer
Graphical Representation
Graphs - A graph is another way to visually show the behavior of data. To create a graph, the distribution
of scores must first be organized. For instance, presenting the scores below in an unorganized manner
can provide confusing information or none at all; reporting raw scores can even keep some significant
scores from being noticed.
But when we arrange the scores from highest to lowest, which is a form of score distribution, some
pieces of information are gradually brought forth and exposed.
Distribution of Scores
120
110
105
105
100
100
95
90
90
90
85
85
80
75
65
The score distribution can be organized further in the form of a frequency distribution. A frequency
distribution provides information about the raw scores and how often each one occurs, and so gives
clearer insights about the behavior of the scores.
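As an illustration, the frequency distribution of the 15 scores listed above can be tallied with a short Python sketch. The variable names are only illustrative, and `Counter` is one of several ways to count occurrences.

```python
from collections import Counter

# Scores from the distribution listed above, highest to lowest.
scores = [120, 110, 105, 105, 100, 100, 95, 90, 90, 90, 85, 85, 80, 75, 65]

# Count how many times each raw score occurs.
frequency = Counter(scores)

# Print the frequency distribution from the highest score to the lowest.
print("Score  Frequency")
for score in sorted(frequency, reverse=True):
    print(f"{score:>5}  {frequency[score]:>9}")
```

The score 90 appears three times, which the raw list does not make obvious at a glance.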
Another way of presenting data in a frequency distribution is to present them in tabular form.
A table has the advantage of laying out the data visually. This kind of
presentation is more appealing to the general audience.
Another way of showing the data in graphical form is by using Microsoft Excel, as illustrated in the
graphs below. They show the frequency polygon of the scores in the example cited above.
Notice that the two graphs of the frequency polygon may appear different, but they are
actually the same and disclose the same information.
MEASURES OF CENTRAL TENDENCY
Measures of central tendency are methods that can be used to determine information regarding the average,
ranking, and category of any data distribution. The mean, median, and mode are the three tools for obtaining
the measures of central tendency. But only by knowing and using the appropriate tool can the most
accurate estimate of centrality be achieved. The objective of the measures of central tendency is
to describe the centrality of the distribution in a single numerical value. This single numerical value must
provide a clear description of the common trait being observed in the distribution of scores.
The Mean
The most widely used measure of central tendency is the mean (x̄). It is the arithmetic average of all
the scores. The mean can be determined by adding all the scores together and then dividing the sum by the
total number of scores. The basic formula for the mean is as follows:
$\bar{x} = \frac{\sum x}{N}$
where x̄ is the mean, Σx is the sum of all the scores, and N is the total number of scores.
In the example below concerning the annual income of 12 workers, the mean can be found by
calculating the average score of the distribution.
In this example, the mean is an appropriate measure of central tendency because the distribution is
fairly well-balanced. This means that there are no extremely high or extremely low scores in either
direction that can unduly influence the average of the scores. Thus, the mean value of 190,083.00
represents the total picture of the distribution (i.e. annual incomes). In other words, it describes the
entire distribution in a “more or less” or approximate fashion.
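The computation of the mean can be sketched in Python as follows. The income figures of the 12 workers are not reproduced in this reviewer, so the sketch uses the 15 scores from the score distribution listed earlier instead.

```python
# Scores from the score distribution listed earlier (not the income example).
scores = [120, 110, 105, 105, 100, 100, 95, 90, 90, 90, 85, 85, 80, 75, 65]

# Mean = sum of all the scores divided by the total number of scores.
total = sum(scores)      # 1395
n = len(scores)          # 15
mean = total / n

print(mean)  # 93.0
```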
Mean of Skewed Distribution
There are situations wherein the mean cannot be trusted to provide a measure of central tendency
because it portrays an extremely distorted picture of the average value of a distribution of scores.
This happens when a few extremely high or extremely low scores pull the mean toward them.
The Median
The median is the point that separates the upper half from the lower half of the distribution. It is the
middle point or midpoint of any distribution. If the distribution is made up of an even number of scores,
the median can be found by determining the point that lies halfway between the two middlemost
scores.
As you can observe, even with the presence of an extreme score at the high end of the distribution, the
value of the median remains undisturbed.
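A minimal sketch of both points, using hypothetical scores that are not taken from this reviewer: with an even number of scores the median is the point halfway between the two middlemost scores, and replacing the highest score with an extreme value changes the mean but not the median.

```python
from statistics import median

# Hypothetical scores (even number of cases), assumed for illustration only.
scores = [10, 12, 14, 16, 18, 20]

# The two middlemost scores are 14 and 16, so the median lies halfway: 15.
print(median(scores))              # 15.0
print(sum(scores) / len(scores))   # 15.0

# Replace the highest score with an extreme value.
with_extreme = [10, 12, 14, 16, 18, 200]

print(median(with_extreme))                    # 15.0 (undisturbed)
print(sum(with_extreme) / len(with_extreme))   # 45.0 (pulled upward)
```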
The Mode
Another measure of central tendency is called the mode. It is the most frequently occurring score in a
distribution. In a histogram, the mode is always located beneath the tallest bar.
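As a quick check, the mode of the 15 scores from the score distribution listed earlier can be found with Python's `statistics` module. `multimode` is used alongside `mode` because a distribution can have more than one most frequent score.

```python
from statistics import mode, multimode

# Scores from the score distribution listed earlier.
scores = [120, 110, 105, 105, 100, 100, 95, 90, 90, 90, 85, 85, 80, 75, 65]

# The most frequently occurring score.
most_frequent = mode(scores)
print(most_frequent)       # 90

# All scores tied for the highest frequency (only one here).
all_modes = multimode(scores)
print(all_modes)           # [90]
```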
The best way to illustrate the comparative applicability of the mean, median and mode is to look again at
the skewed distribution.
Distribution of monthly income per household in a certain municipality.
Income distributions are typically skewed to the right because the low end has a fixed limit of zero while
the high end has no limit. If we consider the area under the curve to be 100%, then the median is the exact
midpoint of the distribution: the areas below and above the median are each equal to 50 percent. Thus, if
the median income is 20,000.00, this means that 50% of the households have an income below
20,000.00 and 50% of the households have an income above 20,000.00. On the other hand, the mean in
our figure above indicates a high income of 100,000.00, which reflects the positive skew of the curve. The
value of the mean gives a distorted picture of reality because it is unduly influenced
by the few affluent income earners at the high end of the curve, whose monthly income is around
500,000.00. The modal income of 10,000.00 per month also seems to distort
reality, this time toward the low side. The mode is always the highest point of the curve. In this example, the
mode represents the most frequently earned income; it is far lower than the median income of
20,000.00. Both the mean and the mode give a false portrait of the typical income, and the truth lies
somewhere in between.
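The comparison can be reproduced with a small sketch. The nine household incomes below are hypothetical: they were chosen only so that the mode, median, and mean come out to the 10,000.00, 20,000.00, and 100,000.00 quoted in the discussion, and they are not the data behind the figure.

```python
from statistics import median, mode

# Hypothetical monthly household incomes, right-skewed by one affluent earner.
incomes = [10_000, 10_000, 10_000, 15_000, 20_000,
           35_000, 100_000, 200_000, 500_000]

modal_income = mode(incomes)                # most frequently earned income
median_income = median(incomes)             # midpoint of the distribution
mean_income = sum(incomes) / len(incomes)   # pulled up by the high end

print(modal_income)    # 10000
print(median_income)   # 20000
print(mean_income)     # 100000.0
```

In a right-skewed distribution like this one, the three measures line up as mode, then median, then mean, from lowest to highest.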
Measures of Dispersion
The measures of central tendency only provide information about the similarity or typicality of scores.
But to fully describe the distribution, we need to gain information about how scores differ or vary. The
description of the distribution can only be complete if some information about its variability is known. To
substantiate the information provided by the measures of centrality, some degree of dispersion must
also be brought to light.
Measures of Variability
There are three measures of variability: the range, the standard deviation, and the variance. These
three measures give information about the spread of the scores in a distribution. Metaphorically,
variability asserts that a glass half-full is also half-empty. Being half-full is about centrality and being
half-empty is about variability.
The Range
The range, symbolized by R, describes the variability of scores by merely providing the width of the
entire distribution. The range can be found by simply determining the difference between the highest
score and the lowest score, that is, $R = \text{highest score} - \text{lowest score}$. This difference is
always a single value.
The example below shows the calculation of the range from a distribution of annual incomes:
The range gives information about the scattering of the scores by merely using the two
extreme points. Outliers directly influence the range because they determine the highest and lowest
values in the dataset. A single outlier can drastically increase the range, making it a less reliable measure
of variability for datasets with outliers.
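The effect of an outlier on the range can be sketched as follows. The annual income figures are not reproduced in this reviewer, so the sketch uses the 15 scores from the score distribution listed earlier, and the outlier of 300 is a hypothetical value added for illustration.

```python
# Scores from the score distribution listed earlier.
scores = [120, 110, 105, 105, 100, 100, 95, 90, 90, 90, 85, 85, 80, 75, 65]

# Range = highest score - lowest score.
score_range = max(scores) - min(scores)
print(score_range)          # 55  (120 - 65)

# Add one hypothetical outlier and recompute.
with_outlier = scores + [300]
range_with_outlier = max(with_outlier) - min(with_outlier)
print(range_with_outlier)   # 235 (300 - 65)
```

One added score more than quadruples the range even though the other 15 scores are unchanged.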
The Standard Deviation
The standard deviation (SD) is the lifeblood of the variability concept. It measures
how much the scores in the distribution typically differ from the mean of the distribution. Unlike
the range, which utilizes only two extreme scores, SD employs every score in the distribution. It is
computed with reference to the mean (not the median or the mode) and it requires that the scores
be in interval form.
A distribution with a small standard deviation shows that the trait being measured is homogeneous, while
a distribution with a large standard deviation indicates that the trait being measured is
heterogeneous. A distribution with a standard deviation of zero implies that the scores are all the same
(i.e. 10, 10, 10, 10).
It is important to note that if all scores are the same, there is no dispersion, no deviation, and no
scattering of scores in the distribution. Zero is therefore the lowest possible value: variability can
never be less than zero.
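The formula simply states that the standard deviation (SD) is equal to the square root of the difference
between the sum of the squared raw scores divided by the number of cases, and the squared mean
(Sprinthall, 1994):
$SD = \sqrt{\frac{\sum x^2}{N} - \bar{x}^2}$
where Σx² is the sum of the squared raw scores, N is the number of cases, and x̄ is the mean.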
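The verbal formula can be followed step by step in Python. This is a sketch using the 15 scores from the score distribution listed earlier. Because the formula divides by the number of cases N, it corresponds to what Python's `statistics` module calls the population standard deviation (`pstdev`), which is used here only as a cross-check.

```python
import math
from statistics import pstdev

# Scores from the score distribution listed earlier.
scores = [120, 110, 105, 105, 100, 100, 95, 90, 90, 90, 85, 85, 80, 75, 65]

n = len(scores)                                    # number of cases
mean = sum(scores) / n                             # 93.0
mean_of_squares = sum(x * x for x in scores) / n   # sum of squared scores / N

# SD = square root of (sum of squared scores / N  -  mean squared).
sd = math.sqrt(mean_of_squares - mean ** 2)

print(round(sd, 2))               # 13.76
print(round(pstdev(scores), 2))   # 13.76 (cross-check)
```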
The Variance
Variance is another technique for assessing disparity in a distribution. In the simplest sense, variance is
the square of the standard deviation. The formula is illustrated below:
$\text{Variance} = SD^2$
Conceptually, variance conveys the same information as the standard deviation. When both the standard
deviation and the variance manifest large values, the distribution is heterogeneous; when they both
manifest small values, the distribution is homogeneous.
While the standard deviation describes how spread out the scores are from the mean by
taking the square root of the variance, the variance, on the other hand, is the average
of the squared amounts by which the scores differ from the mean.
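The relationship between the two measures can be checked with a short sketch, again using the 15 scores from the score distribution listed earlier. The variance is computed here as the average squared deviation from the mean, dividing by the number of cases N; this matches `pstdev` in Python's `statistics` module, which serves only as a cross-check.

```python
import math
from statistics import pstdev

# Scores from the score distribution listed earlier.
scores = [120, 110, 105, 105, 100, 100, 95, 90, 90, 90, 85, 85, 80, 75, 65]

n = len(scores)
mean = sum(scores) / n   # 93.0

# Variance = average of the squared deviations from the mean.
squared_deviations = [(x - mean) ** 2 for x in scores]
variance = sum(squared_deviations) / n

# The standard deviation is the square root of the variance.
sd = math.sqrt(variance)

print(round(variance, 2))   # 189.33
print(round(sd, 2))         # 13.76
```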