Measures of Relative Standing

Uploaded by

Gracean Maslog

MEASURES OF RELATIVE STANDING
Unit 1

FSUU MATHEMATICS DEPARTMENT
MELVIN P. BAHIAN
INSTRUCTOR
Learning Outcomes of the Lesson

At the end of the lesson, the learners are able to:

1. describe measures of relative standing;
2. solve for z scores, percentiles, and quartiles, and construct
box-and-whisker plots; and
3. apply and illustrate their uses.
What are measures of relative standing?

Key concept: measures of relative standing are numbers showing the
location of data values relative to the other values within the same
data set.

The most important concept in this section is the z score, which will
be used often.

Percentiles and quartiles, which are common statistics, as well as
a statistical graph called a boxplot, will also be discussed.
[Figure: Standard Normal Distribution]
Unit 1.1:
z Scores
What is a z score?
A z score is found by converting a value to a standard scale. It is
the number of standard deviations that a data value is away
from the mean.

A z score is a measure of position, in the sense that it describes
the location of a value (in terms of standard deviations) relative
to the mean.

A z score (or standardized value) is the number of standard
deviations that a given value, x, is above or below the mean.
How to calculate a z score?
To calculate the z score of a value from a population, we use the
following formula:

$$z = \frac{x - \mu}{\sigma}$$

Where:
$z$ = z score
$x$ = the data value
$\mu$ = mean of the data set
$\sigma$ = standard deviation of the data set (which is a population in this
case)
How to calculate a z score?
To calculate the z score of a value from a sample, we use the
following formula:

$$z = \frac{x - \bar{x}}{s}$$

Where:
$z$ = z score
$x$ = the data value
$\bar{x}$ = mean of the data set
$s$ = standard deviation of the data set (which is a sample in this
case)
Properties of z Score
1. A z score is the number of standard deviations that a
given value x is above or below the mean.
2. z scores are expressed as numbers with no units of
measurement.
3. A data value is significantly low if its z score is less
than or equal to −2 or the value is significantly high if
its z score is greater than or equal to +2.
4. If an individual data value is less than the mean, its
corresponding z score is a negative number.
Example: Comparing a Quarter’s
Weight and Adult Body Temperature
Which of the following two data values is more extreme relative
to the data set from which it came?

A. The 99°F temperature of an adult (among 106 adults with
sample mean $\bar{x}$ = 98.20°F and sample standard deviation $s$ =
0.62°F)

B. The 5.7790 g weight of a quarter (among 40 quarters with sample
mean $\bar{x}$ = 5.63930 g and sample standard deviation $s$ =
0.06194 g)
Example: Comparing a Quarter’s
Weight and Adult Body Temperature

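The 99°F body temperature and the 5.7790 g weight of a
quarter can be standardized by converting each of them
to a z score:

$$z_A = \frac{x - \bar{x}}{s} = \frac{99^\circ\mathrm{F} - 98.20^\circ\mathrm{F}}{0.62^\circ\mathrm{F}} = 1.29$$

$$z_B = \frac{x - \bar{x}}{s} = \frac{5.7790\,\mathrm{g} - 5.63930\,\mathrm{g}}{0.06194\,\mathrm{g}} = 2.26$$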
Example: Comparing a Quarter’s
Weight and Adult Body Temperature
The z scores show that the 99°F body temperature is 1.29
standard deviations above the mean, and the 5.7790 g
weight of the quarter is 2.26 standard deviations above
the mean.

Because the weight of the quarter is farther above its
mean (measured in standard deviations), it is the more
extreme value. A weight of 5.7790 g for a quarter is more
extreme than a 99°F body temperature.
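
The comparison above can be checked with a short Python sketch. The function name `z_score` and the variable names are illustrative choices, not part of the lesson; the numbers are the sample statistics given in the example.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Sample statistics taken from the example above.
z_temp = z_score(99.0, 98.20, 0.62)              # body temperature, in degrees F
z_quarter = z_score(5.7790, 5.63930, 0.06194)    # quarter weight, in grams

# z scores have no units, so the two values can be compared directly.
more_extreme = "quarter" if abs(z_quarter) > abs(z_temp) else "temperature"

print(round(z_temp, 2), round(z_quarter, 2), more_extreme)
```

Comparing absolute values means a value far below its mean would count as just as extreme as one far above it.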
Using z Scores to Identify Significant
Values

[Figure: z-score number line from −3 to 3, marking significantly low values, values not significant, and significantly high values]

Significantly low values: $z \le -2$
Significantly high values: $z \ge 2$
Values not significant: $-2 < z < 2$
Example: Is an Earthquake Magnitude of
4.01 Significantly High?
Among the earthquakes listed in Data Set 24
“Earthquakes,” one of the stronger earthquakes had a
magnitude of 4.01. The magnitudes are measured on the
Richter scale, and only earthquakes of magnitude 1.0 or
higher are included. The 600 magnitudes in the data set
have a mean of 2.572 and a standard deviation of 0.651.
For this data set, is the magnitude of 4.01 significantly
high?
Example: Is an Earthquake Magnitude of
4.01 Significantly High?
Solution: The magnitude of 4.01 is converted to a z score:

$$z = \frac{x - \bar{x}}{s} = \frac{4.01 - 2.572}{0.651} = 2.21$$

Interpretation: The computed z score is 2.21. Because
2.21 is greater than or equal to 2, the magnitude of 4.01 is
significantly high.
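
The cutoff rule ($z \le -2$ is significantly low, $z \ge 2$ is significantly high) can be sketched in Python as follows. The function name `classify` and its `cutoff` parameter are hypothetical names introduced only for this illustration.

```python
def classify(z, cutoff=2.0):
    """Label a z score using the +/-2 rule from the lesson."""
    if z <= -cutoff:
        return "significantly low"
    if z >= cutoff:
        return "significantly high"
    return "not significant"

# Earthquake example: magnitude 4.01, mean 2.572, standard deviation 0.651.
z = (4.01 - 2.572) / 0.651
label = classify(z)

print(round(z, 2), label)
```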
Practice
1. Women’s heights have a mean of 63.6 inches and a
standard deviation of 2.5 inches. Find the z-score
corresponding to a woman with a height of 70 inches
and determine whether the height is unusual.
2. IQ scores have a mean of 100 and a standard
deviation of 16. Albert Einstein reportedly had an IQ of
160.
a. What is the difference between Einstein’s IQ
and the mean?
b. What is the z-score of Albert Einstein’s IQ?
Unit 1.2:
PERCENTILE
What is a Percentile?
A percentile indicates the percentage of data values in a
set that fall below a certain value.

Percentiles divide the whole data set into one hundred equal parts.
On a distribution graph, the percentiles produce 99 division marks,
and each of these marks is what we call a percentile. A percentile
mark at a specific data value shows the percentage of data that is
found below (or up to) that value. Because each mark depends on where
the data fall, percentiles are not necessarily equally spaced along a
distribution (see the figure above).
How to calculate percentiles?

To calculate the percentile of a certain value $X$ from the data set,
we use the following equation:

$$\text{Percentile of } X = \frac{\text{number of data points less than } X}{\text{total number of data points}} \times 100$$
Example 1

Sidney is taking a biology course in university. She got a mark
of 78%, and a list of all marks from her class (including her
mark) is given by:

56, 83, 74, 67, 47, 54, 82, 78, 86, 90


Example 1

First, we arrange the scores in an


ascending order. That is,

47, 54, 56, 67, 74, 78, 82, 83, 86, 90

Now, solving for the percentile Sidney


scored in, we use the percentile formula:
Example 1
There are 5 marks less than 78 (47, 54, 56, 67, and 74) out of
10 marks in total, so

$$\text{Percentile of } 78 = \frac{5}{10} \times 100 = 50$$

Therefore, Sidney scored in the 50th percentile; that is, her
mark is higher than 50% of the marks in the class.
Example 2
The table lists the fifty “Space Mountain” 10 AM wait times (in minutes)
from Data Set 33 “Disney World Wait Times,” arranged
in increasing order. Find the percentile for the wait time of
45 minutes.
10 15 15 15 15 15 20 20 20 20
25 25 25 25 25 25 30 30 30 30
30 30 30 30 35 35 35 35 35 35
35 35 40 40 40 40 45 50 50 50
50 50 55 55 60 75 75 75 105 110
Example 2
From the sorted list of wait times, there are 36 wait times
less than 45 minutes. So,

$$\text{Percentile of } 45 = \frac{36}{50} \times 100 = 72$$

A wait time of 45 minutes is in the 72nd percentile. This
can be interpreted loosely as this: a wait time of 45
minutes separates the lowest 72% of values from the
highest 28% of values. We have $P_{72} = 45$ minutes.
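
As a minimal sketch, the percentile formula can be written in Python and applied to both examples. The function name `percentile_of` and the list names are illustrative; the data are the class marks and wait times listed above.

```python
def percentile_of(value, data):
    """Percentage of data points strictly less than value."""
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)

marks = [56, 83, 74, 67, 47, 54, 82, 78, 86, 90]

wait_times = [
    10, 15, 15, 15, 15, 15, 20, 20, 20, 20,
    25, 25, 25, 25, 25, 25, 30, 30, 30, 30,
    30, 30, 30, 30, 35, 35, 35, 35, 35, 35,
    35, 35, 40, 40, 40, 40, 45, 50, 50, 50,
    50, 50, 55, 55, 60, 75, 75, 75, 105, 110,
]

print(percentile_of(78, marks))        # Sidney's mark
print(percentile_of(45, wait_times))   # 45-minute wait time
```

The data do not need to be sorted for this calculation, because the function only counts how many values fall below the given one.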
Converting a Percentile to a Data
Value
$n$ – total number of values in the data set
$k$ – percentile being used (example: for the 25th
percentile, $k = 25$)
$L$ – locator that gives the position of a value (example:
for the 12th value in the sorted list, $L = 12$)
$P_k$ – $k$th percentile (example: $P_{25}$ is the 25th percentile)

$$L = \frac{k}{100} \cdot n$$
Example 3

Refer to Example 2. Find the value of the
25th percentile, $P_{25}$.
10 15 15 15 15 15 20 20 20 20
25 25 25 25 25 25 30 30 30 30
30 30 30 30 35 35 35 35 35 35
35 35 40 40 40 40 45 50 50 50
50 50 55 55 60 75 75 75 105 110
Example 3
We first compute the value of the locator $L$. In
this computation, we use $k = 25$ because we are
attempting to find the value of the 25th percentile, and we
use $n = 50$ (from the data set).

So,

$$L = \frac{k}{100} \cdot n = \frac{25}{100} \cdot 50 = 12.5$$
Example 3
Since $L = 12.5$ is not a whole number, we round it up to the
next whole number, so $L = 13$.

Then,
$P_{25}$ = 13th value in the sorted list
$P_{25} = 25$

Roughly speaking, about 25% of the wait times are less
than 25 minutes and 75% of the wait times are more than
25 minutes.
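
The locator procedure can be sketched in Python as shown here. The function name `percentile_value` is illustrative. The lesson only shows the case where $L$ is not a whole number; the handling of a whole-number $L$ (averaging the $L$th and $(L+1)$th values) is an assumption based on a common textbook convention, and it is labeled as such in the code.

```python
import math

def percentile_value(k, data):
    """Value of the k-th percentile (k from 1 to 99) using the locator L = k/100 * n."""
    if not 1 <= k <= 99:
        raise ValueError("k must be between 1 and 99")
    values = sorted(data)
    n = len(values)
    L = k * n / 100
    if not L.is_integer():
        # L is not whole: round up and take the value at that position.
        position = math.ceil(L)
        return values[position - 1]      # positions start at 1, indexes at 0
    # ASSUMPTION (not shown in the lesson): when L is whole,
    # average the L-th value and the next value.
    position = int(L)
    return (values[position - 1] + values[position]) / 2

wait_times = [
    10, 15, 15, 15, 15, 15, 20, 20, 20, 20,
    25, 25, 25, 25, 25, 25, 30, 30, 30, 30,
    30, 30, 30, 30, 35, 35, 35, 35, 35, 35,
    35, 35, 40, 40, 40, 40, 45, 50, 50, 50,
    50, 50, 55, 55, 60, 75, 75, 75, 105, 110,
]

p25 = percentile_value(25, wait_times)   # L = 12.5, rounded up to the 13th value
print(p25)
```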
Unit 1.3:
QUARTILES
What is Quartile?
Quartile focuses on dividing the data distribution into
four parts, where each quartile is the specific point
marking the division between the first quarter and the
second, the second quarter and the third or the third
quarter and the fourth. In simple words, quartiles are
values that divide a data set into quarters after the
data set has been ordered; each quartile has a name
and they are: 𝑄! , 𝑄" , and 𝑄# .
What is a Quartile?
Where:
$Q_1$ - separates the lowest 25% of the sorted data from the highest 75%
$Q_2$ - the median; separates the lowest 50% of the
sorted data from the highest 50%
$Q_3$ - separates the lowest 75% of the sorted data from the highest 25%

The middle 50% of the data lies between the first and third
quartiles. The width of this interval is the interquartile
range, which is found by subtracting the first quartile from
the third quartile.
What is Quartile?
If odd number
%&
𝑄$ = 𝑥#$%
&
If even number
'#('#
& & $%
𝑄$ = "

Where:
𝑘 – quartile values (1, 2, 3)
𝑛 – number of data
Example 1

Find the quartiles for each data set:

a. 9, 3, 7, 5, 2, 8, 12

b. 2, 3, 5, 7, 8, 9, 12, 15

c. 2, 3, 5, 7, 8, 9, 12, 15, 35
Solution (a)

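We first find the median, for which we arrange the data
values in ascending order and find the value at the midpoint.

That is,
2, 3, 5, 7, 8, 9, 12
The median is the middle value, 7. $Q_1$ is the median of the
lower half (2, 3, 5), and $Q_3$ is the median of the upper half
(8, 9, 12).
Then,
$Q_2 = 7$, $Q_1 = 3$, and $Q_3 = 9$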
Solution (b)

This particular data set already has its values ordered
from lowest to highest, so we just find the median:

That is,
2, 3, 5, 7, 8, 9, 12, 15
Then,

$$Q_2 = \frac{7 + 8}{2} = 7.5, \quad Q_1 = \frac{3 + 5}{2} = 4, \quad \text{and} \quad Q_3 = \frac{9 + 12}{2} = 10.5$$
Solution (c)

Data set c is already ordered too, and given that it
has an odd number of values, we can easily find its median:

That is,
2, 3, 5, 7, 8, 9, 12, 15, 35
Then,

$$Q_2 = 8, \quad Q_1 = \frac{3 + 5}{2} = 4, \quad \text{and} \quad Q_3 = \frac{12 + 15}{2} = 13.5$$
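
The three solutions above all follow the same pattern: find the median, then take the median of the values below it and the median of the values above it. A minimal Python sketch of that pattern is given here; the function name `quartiles` is illustrative, and leaving the median out of both halves when the count is odd is inferred from solutions (a) and (c) rather than stated in the lesson.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median of each half of the sorted data."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]
    # For an odd count the middle value is left out of both halves
    # (inferred from the worked solutions).
    upper = values[half + 1:] if n % 2 else values[half:]
    return median(lower), median(values), median(upper)

print(quartiles([9, 3, 7, 5, 2, 8, 12]))             # data set a
print(quartiles([2, 3, 5, 7, 8, 9, 12, 15]))         # data set b
print(quartiles([2, 3, 5, 7, 8, 9, 12, 15, 35]))     # data set c
```

Other procedures for finding quartiles exist, so software may report slightly different values for the same data.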
Statistics defined using quartiles and
percentiles
Interquartile range: $IQR = Q_3 - Q_1$

Semi-interquartile range: $SIQR = \frac{Q_3 - Q_1}{2}$

Mid-quartile: $MQ = \frac{Q_3 + Q_1}{2}$

10-90 percentile range: $P_{90} - P_{10}$
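
As a small illustration, the quartile-based statistics can be computed in Python from the quartiles of data set b in the earlier example ($Q_1 = 4$, $Q_3 = 10.5$). The function name `quartile_stats` and the dictionary keys are hypothetical names chosen for this sketch.

```python
def quartile_stats(q1, q3):
    """Interquartile range, semi-interquartile range, and mid-quartile."""
    iqr = q3 - q1
    return {"IQR": iqr, "SIQR": iqr / 2, "MQ": (q3 + q1) / 2}

# Quartiles of data set b: 2, 3, 5, 7, 8, 9, 12, 15
stats = quartile_stats(4, 10.5)
print(stats)
```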


Unit 1.4:
Boxplots
What is a boxplot (skeletal)?

A boxplot (or a box-and-whisker diagram) is a
graph of a data set that consists of a line
extending from the minimum value to the
maximum value, and a box with lines drawn
at the first quartile, the median, and the
third quartile.
Boxplots (skeletal)

[Figure: skeletal boxplot labeled with the minimum value, first quartile ($Q_1$), median ($Q_2$), third quartile ($Q_3$), and maximum value]


5-Number Summary

For a set of data, the 5-number summary
consists of these five values:

1. Minimum
2. First Quartile, $Q_1$
3. Second Quartile, $Q_2$ (Median)
4. Third Quartile, $Q_3$
5. Maximum
Example

10 15 15 15 15 15 20 20 20 20
25 25 25 25 25 25 30 30 30 30
30 30 30 30 35 35 35 35 35 35
35 35 40 40 40 40 45 50 50 50

50 50 55 55 60 75 75 75 105 110
Example

For the set of data above, the 5-number
summary values are:

1. Minimum: 10
2. First Quartile, $Q_1 = 25$
3. Second Quartile, $Q_2 = 35$ (Median)
4. Third Quartile, $Q_3 = 50$
5. Maximum: 110
Example: Construct Boxplot

For the set of data above, the boxplot is shown
below.

[Figure: boxplot of the wait times on a horizontal scale from 10 to 110, marked in steps of 10]
Boxplots

Note: because there is no universal agreement on


procedures for finding quartiles, and because
boxplots are based on quartiles, different
technologies may yield different boxplots.

Boxplots give us some information about the


spread of the data
Boxplot
Since the shape of a boxplot is determined by the 5-number
summary, a boxplot is not a graph of the full distribution of
the data, and it does not show as much detailed
information as a histogram or stem plot.

However, boxplots are often great for comparing two
or more data sets.

***Graph the boxplots on the same scale so that comparisons
can be easily made.
Skewness

A boxplot can often be used to identify skewness.

A distribution of data is skewed if it is not


symmetric and extends more to one side than to
the other.
Skewness
What are Outliers?

Outliers can strongly affect the values of some important
statistics, such as the mean and the standard deviation.

Outliers can also strongly affect important methods.

***When analyzing data, always identify the
outliers and consider their effects, which can be
substantial.
What is a modified boxplot?

A modified boxplot is a regular boxplot constructed
with these modifications:

1. A special symbol (such as an asterisk or point) is
used to identify outliers, as defined by the 1.5 × IQR
criteria in the construction steps, and

2. The solid horizontal line extends only as far as the
minimum data value that is not an outlier and the
maximum data value that is not an outlier.
Constructing a Modified Boxplot

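1. Find the quartiles $Q_1$, $Q_2$, and $Q_3$.

2. Find $IQR = Q_3 - Q_1$.

3. Evaluate $1.5 \times IQR$ to determine the outliers.

4. In a modified boxplot, a data value is an outlier if it
is:
a. above $Q_3$ by more than $1.5 \times IQR$ (greater than $Q_3 + 1.5 \times IQR$), or
b. below $Q_1$ by more than $1.5 \times IQR$ (less than $Q_1 - 1.5 \times IQR$).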
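
The steps above can be sketched in Python using the wait-time data and the quartiles from the 5-number summary example ($Q_1 = 25$, $Q_3 = 50$). The function name `find_outliers` and the returned tuple layout are illustrative choices, not part of the lesson.

```python
def find_outliers(data, q1, q3):
    """Apply the 1.5 x IQR rule; return the outliers and the whisker endpoints."""
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [x for x in data if x < low_fence or x > high_fence]
    inliers = [x for x in data if low_fence <= x <= high_fence]
    # In a modified boxplot the line stops at the most extreme non-outliers.
    return outliers, (min(inliers), max(inliers))

wait_times = [
    10, 15, 15, 15, 15, 15, 20, 20, 20, 20,
    25, 25, 25, 25, 25, 25, 30, 30, 30, 30,
    30, 30, 30, 30, 35, 35, 35, 35, 35, 35,
    35, 35, 40, 40, 40, 40, 45, 50, 50, 50,
    50, 50, 55, 55, 60, 75, 75, 75, 105, 110,
]

outliers, whiskers = find_outliers(wait_times, q1=25, q3=50)
print(outliers, whiskers)
```

With $IQR = 25$, the cutoffs are $25 - 37.5 = -12.5$ and $50 + 37.5 = 87.5$, so only the two longest wait times are flagged.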
Constructing a Modified Boxplot
A special symbol (such as an asterisk or a point) is
used to identify outliers.

[Figure: modified boxplot labeled with the minimum value, first quartile ($Q_1$), median ($Q_2$), third quartile ($Q_3$), and maximum value, with outliers marked by asterisks]



When you do anything that involves
data, some factors to consider:
1. Context of the data
2. Source of the data
3. Sampling of the data
4. Measures of Center
5. Measures of Variation
6. Distribution
7. Outliers
8. Changing Patterns over time
9. Conclusions
10. Practical Implications
Properties of a Density Curve

1. A smooth curve
2. Is always on or above the horizontal axis
3. An area of exactly 1 underneath it
4. An area under the curve within range of values is
the proportion of all observations that fall in the
range
Let’s do this
Activity!
Exercises

1. A normal distribution of scores has a standard


deviation of 10. Find the z-scores corresponding to
each of the following values:
a. A score that is 20 points above the mean. z=2
b. A score that is 10 points below the mean. z=-1
c. A score that is 15 points above the mean z=1.5
d. A score that is 30 points below the mean. z=-3
Exercises
2. The Wechsler Adult Intelligence Scale is composed of a
number of subtests. On one subtest, the raw scores have a mean
of 35 and a standard deviation of 6. Assuming these raw scores
form a normal distribution:
a. What number represents the 65th percentile (what number
separates the lower 65% of the distribution)? (Answer: 37.31)
b. What number represents the 90th percentile? (Answer: 42.71)
c. What is the probability of getting a raw score between 28
and 38? (Answer: 57%)
d. What is the probability of getting a raw score between 41
and 44? (Answer: 9%)
Exercises

3. Scores on the SAT form a normal distribution


with 𝜇 = 500 and 𝜎 = 100.
a. What is the minimum score necessary to be
in the top 15% of the SAT distribution? 604
b. Find the range of values that defines the
middle 80% of the distribution of SAT scores.
(Answer: 372 and 628, with z scores of −1.28 and
1.28)
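
For readers who want to check the normal-distribution answers in Exercises 2 and 3, here is a hedged sketch using `NormalDist` from Python's standard `statistics` module. The variable names are illustrative. Because the code uses unrounded z values, a result can differ slightly in the second decimal place from an answer obtained with a printed z table.

```python
from statistics import NormalDist

# Exercise 2: subtest raw scores, mean 35 and standard deviation 6.
subtest = NormalDist(mu=35, sigma=6)
p65 = subtest.inv_cdf(0.65)                       # 65th percentile
p90 = subtest.inv_cdf(0.90)                       # 90th percentile
prob_28_38 = subtest.cdf(38) - subtest.cdf(28)    # P(28 < score < 38)
prob_41_44 = subtest.cdf(44) - subtest.cdf(41)    # P(41 < score < 44)

# Exercise 3: SAT scores, mean 500 and standard deviation 100.
sat = NormalDist(mu=500, sigma=100)
top15_cutoff = sat.inv_cdf(0.85)                  # minimum score for the top 15%
middle80 = (sat.inv_cdf(0.10), sat.inv_cdf(0.90)) # bounds of the middle 80%

print(round(p65, 2), round(p90, 2))
print(round(prob_28_38 * 100), round(prob_41_44 * 100))
print(round(top15_cutoff), [round(v) for v in middle80])
```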
