GE MODMAT Unit 4 Statistics 1

This document discusses statistical concepts including descriptive statistics, measures of central tendency, measures of variation, and measures of relative position. It provides examples and explanations of statistical terms like mean, median, mode, range, standard deviation, variance, z-scores, and how to calculate them.

Gov. Alfonso D. Tan College
Maloro, Tangub City, Misamis Occidental 7214
www.gadtc.edu.ph

UNIT 4: Statistics

Unit Intended Learning Outcomes


At the end of this unit, you should be able to:
1. Utilize a variety of statistical tools to process and manage numerical data.
2. Use the methods of linear regression and correlations to predict the value of a
variable given certain conditions.

Discussion
DATA MANAGEMENT

Statistics involves the collection, organization, summarization, presentation, and
interpretation of data.

2 Branches of Statistics
1. Descriptive Statistics involves the collection, organization, summarization, and
presentation of data.

2. Inferential Statistics involves the interpretation of the data and the drawing of
conclusions from it.

Population – the entire group under consideration.


Sample – any subset of the population.

MEASURES OF CENTRAL TENDENCY


A measure of central tendency locates the center of a set of data.

3 Measures of Central Tendency (3 M’s)


1. Mean is the sum of data values divided by the number of data values. The mean
of n numbers is the sum of the numbers divided by n.

Example:
Course                             Grade
Mathematics in the Modern World    87
Contemporary World                 84
Ethics                             85
PATH-FIT                           95
Purposive Communication            82
ROTC                               85
Total                              518

Solution:
$\text{Mean} = \frac{\text{Total of Grades}}{\text{Number of Courses}} = \frac{518}{6} \approx 86.33$
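
The mean computed above can be checked with a short Python sketch. Python is not part of the original handout, and the variable names here are illustrative only.

```python
# Grades taken from the example table above.
grades = {
    "Mathematics in the Modern World": 87,
    "Contemporary World": 84,
    "Ethics": 85,
    "PATH-FIT": 95,
    "Purposive Communication": 82,
    "ROTC": 85,
}

total = sum(grades.values())   # sum of the data values
mean = total / len(grades)     # divided by the number of data values

print(total, round(mean, 2))   # 518 86.33
```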

Weighted Mean is often used when some data values are more important than others.
It is the sum of the products formed by multiplying each number by its assigned weight,
divided by the sum of all the weights.

Example:
Course                             Grade    Units
Mathematics in the Modern World    87       3
Contemporary World                 84       3
Ethics                             85       3
PATH-FIT                           95       1
Purposive Communication            82       3
ROTC                               85       3
Total                                       16

Solution:
Course                             Grade × Units
Mathematics in the Modern World    261
Contemporary World                 252
Ethics                             255
PATH-FIT                           95
Purposive Communication            246
ROTC                               255
Total                              1,364

$\text{Weighted Mean} = \frac{\text{Total of Grades} \times \text{Units}}{\text{Total of Units}} = \frac{1{,}364}{16} = 85.25$
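
As a rough check of the weighted mean above, the same arithmetic can be written in Python. This is only a sketch: the list layout and variable names are assumptions made for illustration, with the units of each course used as the weights.

```python
# (course, grade, units) rows from the example; units act as the weights.
courses = [
    ("Mathematics in the Modern World", 87, 3),
    ("Contemporary World", 84, 3),
    ("Ethics", 85, 3),
    ("PATH-FIT", 95, 1),
    ("Purposive Communication", 82, 3),
    ("ROTC", 85, 3),
]

# Sum of grade x units, then divide by the sum of the units.
weighted_total = sum(grade * units for _, grade, units in courses)
total_units = sum(units for _, _, units in courses)
weighted_mean = weighted_total / total_units

print(weighted_total, total_units, weighted_mean)  # 1364 16 85.25
```

PATH-FIT carries only 1 unit, so its high grade of 95 pulls the weighted mean up less than it pulls the ordinary mean.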

2. The median of a ranked list of n numbers is: (a) the middle number if n is odd, or
(b) the mean of the two middle numbers if n is even.
Example:
Find the median of the data in the following lists.

a. 4, 8, 1, 14, 9, 21, 12
b. 46, 23, 92, 89, 77, 108

Solution
a. The list 4, 8, 1, 14, 9, 21, 12 contains 7 numbers. The median of a list of data
with an odd number of entries is found by ranking the numbers and finding the
middle number.

1, 4, 8, 9, 12, 14, 21

The middle number is 9. Thus 9 is the median.

b. The list 46, 23, 92, 89, 77, 108 contains 6 numbers. The median of a list of data
with an even number of entries is found by ranking the numbers and computing
the mean of the two middle numbers. Ranking the numbers from smallest to
largest gives

23, 46, 77, 89, 92, 108

The two middle numbers are 77 and 89. The mean of 77 and 89 is 83. Thus 83
is the median of the data.
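
The ranking procedure used in parts a and b can be sketched in Python as follows. The helper name `median` is an assumption for illustration, and the function expects a non-empty list.

```python
def median(values):
    """Median by ranking: middle value if n is odd, mean of the two middle values if n is even."""
    ranked = sorted(values)          # rank the data from smallest to largest
    n = len(ranked)
    mid = n // 2
    if n % 2 == 1:                   # odd number of entries
        return ranked[mid]
    return (ranked[mid - 1] + ranked[mid]) / 2   # even number of entries


print(median([4, 8, 1, 14, 9, 21, 12]))     # 9
print(median([46, 23, 92, 89, 77, 108]))    # 83.0
```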

3. Mode refers to the number that occurs most frequently in a list of numbers.

Example:
Find the mode of the data in the following lists.

a. 18, 15, 21, 16, 15, 14, 15, 21


b. 2, 5, 8, 9, 11, 4, 7, 23
c. 5, 3, 7, 4, 5, 7, 4, 3

Solution:
a. In the list 18, 15, 21, 16, 15, 14, 15, 21, the number 15 occurs more often than
the other numbers. Thus 15 is the mode.
b. Each number in the list 2, 5, 8, 9, 11, 4, 7, 23 occurs only once. Because no
number occurs more often than the others, there is no mode.
c. Each of the numbers 5, 3, 7, and 4 in the list 5, 3, 7, 4, 5, 7, 4, 3 occurs twice.
Because no number occurs more often than the others, there is no mode.
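
A minimal Python sketch of the mode, following the convention used in the solutions above: when every distinct value occurs equally often, the list is reported as having no mode. The function name `modes` and the choice to return a list (so that ties for most frequent are all reported) are assumptions for illustration; the function expects a non-empty list.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or [] when no value occurs more often than the others."""
    counts = Counter(values)
    highest = max(counts.values())
    # If there are several distinct values and all occur equally often, there is no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([18, 15, 21, 16, 15, 14, 15, 21]))  # [15]
print(modes([2, 5, 8, 9, 11, 4, 7, 23]))        # []
print(modes([5, 3, 7, 4, 5, 7, 4, 3]))          # []
```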
MEASURES OF VARIATION/DISPERSION
A measure of variability of a set of data is a number that conveys the idea of spread for
the data set.

Measures of Variation
1. Range is the difference between the greatest data value and the least data
value.

Examples:
a. Find the range of the following group of numbers: 10, 12, 5, 16, 7, 13, 4.
b. A data set with 10 numbers: 199, 145, 123, 167, 145, 191, 182, 178, 162, and
151. What is the range?

Solution:
a. The highest number is 16, and the lowest number is 4, so 16 – 4 = 12.
Therefore, the range is 12.

b. The highest number is 199, and the lowest number is 123, so 199 – 123 = 76.
Therefore, the range is 76.
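
Both range examples can be verified with a short Python sketch; the helper name `data_range` is an assumption for illustration.

```python
def data_range(values):
    """Range = greatest data value minus least data value."""
    return max(values) - min(values)


print(data_range([10, 12, 5, 16, 7, 13, 4]))                              # 12
print(data_range([199, 145, 123, 167, 145, 191, 182, 178, 162, 151]))     # 76
```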

2. Standard Deviation makes use of the amount by which each individual data
value deviates from the mean.
3. Variance is the square of the standard deviation of the data.

Population Standard Deviation
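
For reference, the standard textbook definitions of the population and sample standard deviations are sketched below. They are supplied here as an assumption about the intended formulas; the symbols $\mu$, $\sigma$, $\bar{x}$, and $s$ match those used in the z-score formulas of this unit, and $n$ is the number of data values.

```latex
\sigma = \sqrt{\frac{\sum (x - \mu)^2}{n}}
\qquad\qquad
s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}
```

The variance is the square of the corresponding standard deviation: $\sigma^2$ for a population and $s^2$ for a sample.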

Example: Find the Standard Deviation and Variance


The following numbers were obtained by sampling a population.

2, 4, 7, 12, 15

Find the standard deviation and variance of the sample.

Solution:

Step 1: The mean of the numbers is

$\bar{x} = \frac{2 + 4 + 7 + 12 + 15}{5} = \frac{40}{5} = 8$

Step 2: For each number, calculate the difference between the number and the
mean.
Step 3: Calculate the square of each deviation in Step 2, and find the sum of these
squared deviations.

$x$     $x - \bar{x}$      $(x - \bar{x})^2$
2       2 – 8 = -6         (-6)^2 = 36
4       4 – 8 = -4         (-4)^2 = 16
7       7 – 8 = -1         (-1)^2 = 1
12      12 – 8 = 4         4^2 = 16
15      15 – 8 = 7         7^2 = 49
Sum of squared deviations ⟶ 118

Step 4: Because we have a sample of n = 5 values, divide the sum 118 by n – 1, which
is 4.

$s^2 = \frac{118}{4} = 29.5$

Step 5: The standard deviation of the sample is $s = \sqrt{29.5}$. To the nearest hundredth, the
standard deviation is $s \approx 5.43$.

Step 6: From Step 4, the variance is $s^2 = 29.5$.
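
The steps above can be traced in a short Python sketch. The variable names are illustrative; the last line cross-checks the result against `statistics.stdev` from the Python standard library, which also divides by n – 1.

```python
import math
import statistics

data = [2, 4, 7, 12, 15]
n = len(data)

mean = sum(data) / n                                   # Step 1: 40 / 5 = 8
deviations = [x - mean for x in data]                  # Step 2: -6, -4, -1, 4, 7
sum_sq = sum(d ** 2 for d in deviations)               # Step 3: 118
sample_variance = sum_sq / (n - 1)                     # Step 4: 118 / 4
sample_std = math.sqrt(sample_variance)                # Step 5

print(sum_sq, sample_variance, round(sample_std, 2))   # 118.0 29.5 5.43
print(math.isclose(sample_std, statistics.stdev(data)))  # True
```

Dividing by n instead of n – 1 would give the population variance, which is appropriate only when the five numbers are the entire population.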

MEASURES OF RELATIVE POSITION

z-score

The z-score for a given data value x is the number of standard deviations that x is
above or below the mean of the data. The following formulas show how to calculate
the z-score for a data value x in a population and in a sample.
Population: $z_x = \frac{x - \mu}{\sigma}$          Sample: $z_x = \frac{x - \bar{x}}{s}$

Question: What does a z-score of 3 for a data value represent? What does a z-score
of −1 for a data value represent?

Answer: A z-score of 3 for a data value x means that x is 3 standard deviations above
the mean. A z-score of −1 for a data value x means that x is 1 standard deviation below
the mean.
Example 1:
Raul has taken two tests in his chemistry class. He scored 72 on the first test, for
which the mean of all scores was 65 and the standard deviation was 8. He received a
60 on a second test, for which the mean of all scores was 45 and the standard
deviation was 12. In comparison to the other students, did Raul do better on the first
test or the second test?

Solution:
Find the z-score for each test.
$z_{72} = \frac{72 - 65}{8} = 0.875$          $z_{60} = \frac{60 - 45}{12} = 1.25$
Raul scored 0.875 standard deviation above the mean on the first test and 1.25
standard deviations above the mean on the second test. These z-scores indicate that,
in comparison to his classmates, Raul scored better on the second test than he did on
the first test.
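
Raul's comparison can be reproduced with a small Python sketch. The function name `z_score` and its parameter names are assumptions for illustration.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev


first = z_score(72, mean=65, std_dev=8)
second = z_score(60, mean=45, std_dev=12)

print(first, second)     # 0.875 1.25
print(second > first)    # True: the second test is the better relative score
```

Although 72 is the higher raw score, the z-scores put both results on a common scale, which is why the second test comes out ahead.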

Example 2:
A consumer group tested a sample of 100 light bulbs. It found that the mean life
expectancy of the bulbs was 842 h, with a standard deviation of 90 h. One particular light
bulb from the DuraBright Company had a z-score of 1.2. What was the life span of this
light bulb?

Solution:
Substitute the given values into the z-score equation and solve for x .
$z_x = \frac{x - \bar{x}}{s}$

$1.2 = \frac{x - 842}{90}$, where $z_x = 1.2$, $\bar{x} = 842$, $s = 90$

$108 = x - 842$

$950 = x$
The light bulb had a life span of 950 h.
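
Solving the z-score equation for x gives x = mean + z · s. A minimal Python sketch of that rearrangement follows; the function name `value_from_z` is an assumption for illustration.

```python
import math


def value_from_z(z, mean, std_dev):
    """Invert z = (x - mean) / std_dev to recover the data value x."""
    return mean + z * std_dev


life_span = value_from_z(1.2, mean=842, std_dev=90)

print(round(life_span))                  # 950
print(math.isclose(life_span, 950.0))    # True
```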

Percentiles
pth Percentile

A value x is called the pth percentile of a data set provided p% of the data values are
less than x.
Example 1:
In a recent year, the median annual salary for a physical therapist was P3,724,000.
If the 90th percentile for the annual salary of a physical therapist was P5,295,000,
find the percent of physical therapists whose annual salary was
a. More than P3,724,000.
b. Less than P5,295,000.
c. Between P3,724,000 and P5,295,000.

Solution:
a. By definition, the median is the 50th percentile. Therefore, 50% of the physical
therapists earned more than P3,724,000 per year.
b. Because P5,295,000 is the 90th percentile, 90% of all physical therapists made
less than P5,295,000.
c. From parts a and b, 90% − 50% = 40% of the physical therapists earned between
P3,724,000 and P5,295,000.

Percentile for a given Data Value

Given a set of data and a data value x,

$\text{Percentile of score } x = \frac{\text{number of data values less than } x}{\text{total number of data values}} \cdot 100$

Example 1:
On a reading examination given to 900 students, Elaine’s score of 602 was higher than
the scores of 576 of the students who took the examination. What is the percentile for
Elaine’s score?

Solution:
$\text{Percentile} = \frac{\text{number of data values less than } 602}{\text{total number of data values}} \cdot 100 = \frac{576}{900} \cdot 100 = 64$
Elaine’s score of 602 places her at the 64th percentile.
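
The percentile formula can be sketched in Python in two forms: one that takes the counts directly, as in Elaine's example, and one that counts the values below x in a list of scores. Both function names are assumptions, and the short score list is hypothetical data used only to show the second form.

```python
def percentile_from_counts(number_below, total):
    """Percentile = (number of data values less than x) / (total number of data values) * 100."""
    return 100 * number_below / total


def percentile_of_score(scores, x):
    """Same formula, counting the values in `scores` that are strictly less than x."""
    number_below = sum(1 for s in scores if s < x)
    return percentile_from_counts(number_below, len(scores))


print(percentile_from_counts(576, 900))   # 64.0

hypothetical_scores = [50, 60, 70, 80, 90]            # made-up data, not from the handout
print(percentile_of_score(hypothetical_scores, 80))   # 60.0
```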

Quartiles
The three numbers Q1, Q2, and Q3 that partition a ranked data set into four
(approximately) equal groups are called the quartiles of the data. The quartile Q1 is
called the first quartile. The quartile Q2 is called the second quartile. It is the median of
the data. The quartile Q3 is called the third quartile. The following method of finding
quartiles makes use of medians.

The Median Procedure for Finding Quartiles

1. Rank the data.


2. Find the median of the data. This is the second quartile, Q2.
3. The first quartile, Q1, is the median of the data values less than Q2. The third
quartile, Q3, is the median of the data values greater than Q2.

Example:
The following table lists the calories per 100 milliliters of 25 popular sodas. Find the
quartiles for the data.

Calories, per 100 milliliters, of Selected Sodas


43 37 42 40 53 62 36 32 50 49

26 53 73 48 45 39 45 48 40 56

41 36 58 42 39

Solution:
Step 1: Rank the data as shown in the following table.

1) 26 2) 32 3) 36 4) 36 5) 37 6) 39 7) 39 8) 40 9) 40

10) 41 11) 42 12) 42 13) 43 14) 45 15) 45 16) 48 17) 48 18) 49

19) 50 20) 53 21) 53 22) 56 23) 58 24) 62 25) 73

Step 2: The median of these 25 data values has a rank of 13. Thus the median is 43.
The second quartile Q2 is the median of the data, so Q2=43.

Step 3: There are 12 data values less than the median and 12 data values greater
than the median. The first quartile is the median of the data values less than the
median. Thus Q1 is the mean of the data values with ranks of 6 and 7.

$Q_1 = \frac{39 + 39}{2} = 39$

The third quartile is the median of the data values greater than the median. Thus Q3 is
the mean of the data values with ranks of 19 and 20.

$Q_3 = \frac{50 + 53}{2} = 51.5$
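
The median procedure can be sketched in Python as shown below. One assumption is made explicit here: the "values less than Q2" and "values greater than Q2" are taken by rank position (the lower and upper halves of the ranked list, leaving out the middle value when n is odd), which matches the soda example. The function names are illustrative.

```python
def median(ranked):
    """Median of an already ranked, non-empty list."""
    n = len(ranked)
    mid = n // 2
    return ranked[mid] if n % 2 == 1 else (ranked[mid - 1] + ranked[mid]) / 2


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median procedure."""
    ranked = sorted(values)              # Step 1: rank the data
    n = len(ranked)
    q2 = median(ranked)                  # Step 2: median of all the data
    lower = ranked[: n // 2]             # Step 3: ranks below the median position
    upper = ranked[(n + 1) // 2 :]       #         ranks above the median position
    return median(lower), q2, median(upper)


sodas = [43, 37, 42, 40, 53, 62, 36, 32, 50, 49,
         26, 53, 73, 48, 45, 39, 45, 48, 40, 56,
         41, 36, 58, 42, 39]

print(quartiles(sodas))   # (39.0, 43, 51.5)
```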

For further reference, open the link below:


https://www.slideshare.net/topengpogi/measures-of-position-72009187

LINEAR REGRESSION AND CORRELATION

Least-Squares Line. Bivariate data are data given as ordered pairs. The least-squares
regression line, or least-squares line, for a set of bivariate data is the line that
minimizes the sum of the squares of the vertical deviations from each data point to the
line. The equation of the least-squares line for the n ordered pairs
$(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots, (x_n, y_n)$ is $\hat{y} = ax + b$, where

$a = \frac{n \sum xy - (\sum x)(\sum y)}{n (\sum x^2) - (\sum x)^2}$ and $b = \bar{y} - a\bar{x}$.

The equation of the least-squares line can be used to predict the value of one variable
when the value of the other variable is known.
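
The formulas for a and b can be sketched in Python as follows. The four data points are hypothetical, chosen only to keep the arithmetic small, and the function name `least_squares_line` is an assumption for illustration.

```python
def least_squares_line(points):
    """Return (a, b) for y_hat = a*x + b using the summation formulas."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_x2 = sum(x * x for x, _ in points)

    a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)   # slope
    b = sum_y / n - a * (sum_x / n)                                # b = y_bar - a * x_bar
    return a, b


# Hypothetical bivariate data (not from the handout).
points = [(1, 1), (2, 2), (3, 2), (4, 3)]
a, b = least_squares_line(points)

print(round(a, 2), round(b, 2))   # 0.6 0.5
print(round(a * 5 + b, 2))        # 3.5  (predicted y when x = 5)
```

For these points, n = 4, Σx = 10, Σy = 8, Σxy = 23, and Σx² = 30, so a = (92 − 80)/(120 − 100) = 0.6 and b = 2 − 0.6(2.5) = 0.5.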

Linear Correlation Coefficient. The linear correlation coefficient r measures the
strength of a linear relationship between two variables. The closer |r| is to 1, the
stronger the linear relationship is between the variables. For the n ordered pairs
$(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots, (x_n, y_n)$,

$r = \frac{n \sum xy - (\sum x)(\sum y)}{\sqrt{n (\sum x^2) - (\sum x)^2} \cdot \sqrt{n (\sum y^2) - (\sum y)^2}}$
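
A minimal Python sketch of the correlation coefficient, written with the usual summation form of r. The data points are hypothetical and the function name `correlation_coefficient` is an assumption for illustration.

```python
import math


def correlation_coefficient(points):
    """Linear correlation coefficient r from the summation formula."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_x2 = sum(x * x for x, _ in points)
    sum_y2 = sum(y * y for _, y in points)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


# Hypothetical bivariate data (not from the handout).
points = [(1, 1), (2, 2), (3, 2), (4, 3)]
r = correlation_coefficient(points)

print(round(r, 2))   # 0.95
```

Here the numerator is 12 and the denominator is √20 · √8 = √160, so r = 12/√160 ≈ 0.95, which indicates a strong positive linear relationship for this small data set.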

Open the links below for the readings.


http://educ.jmu.edu/~drakepp/FIN360/readings/Regression_notes.pdf

https://www.powershow.com/view4/621ad4-NTJmN/Linear_Regression_and_Correlation_powerpoint_ppt_presentation

Tasks
Task 1. Directions: Read and answer the following as directed.

1. The following table displays the ages of female actors when they starred in their
Oscar-winning Best Actor performances.
Ages of Best Female Actor Award Recipients, Academy Awards, 1980-2015
41 33 31 74 33 49 38 61 21 41 26 80

42 29 33 36 45 49 39 34 26 25 33 35

35 28 30 29 61 32 33 45 66 25 46 55

Find the mean and the median for the data in the table. Round to the nearest
tenth.

2. A professor grades each student on 4 tests, a term paper, and a final
examination. Each test counts as 15% of the course grade. The term paper
counts as 20% of the course grade. The final examination counts as 20% of the
course grade. Alan has test scores of 80, 78, 92, and 84. Alan received an 84 on
his term paper. His final examination score was 88. Use the weighted mean
formula to find Alan’s average for the course. Hint: The sum of all weights is
100% = 1.
3. Find the mean, the median and all modes for the data in the given frequency
distribution.
a. Points Scored by Lynn

Points scored in a basketball game    Frequency
2                                     6
4                                     5
5                                     6
9                                     3
10                                    1
14                                    2
19                                    1

b. Quiz Scores

Scores on a MODMAT Quiz    Frequency
2                          1
4                          2
6                          7
7                          12
8                          10
9                          4
10                         3

c. Ages of Science Fair Contestants

Age Frequency
7 3
8 4
9 6
10 15
11 11
12 7
13 1
4. The fuel efficiency, in miles per gallon, of 12 small utility trucks was measured.
The results are recorded in the table below.
Fuel Efficiency (mpg)
22 25 23 27 15 24 24 32 23 22 25 22

Find the mean and sample standard deviation of these data. Round to the
nearest hundredth.
5. All of the numbers in a sample are the same number. What is the standard
deviation of the sample?
6. If two samples both have the same standard deviation, are the samples
necessarily identical?

MEASURES OF RELATIVE POSITION

1. A data set has a mean of $\bar{x} = 75$ and a standard deviation of 11.5. Find the z-score
for each of the following:
a. x = 85          b. x = 95
c. x = 50          d. x = 75
2. A data set has a mean of $\bar{x} = 6.8$ and a standard deviation of 1.9. Find the z-score
for each of the following:
a. x = 6.2         b. x = 7.2
c. x = 9.0         d. x = 5.0
3. Blood Pressure. A blood pressure test was given to 450 women ages 20 to 36.
It showed that their mean systolic blood pressure was 119.4 mm Hg, with a
standard deviation of 13.2 mm Hg.
a. Determine the z-score, to the nearest hundredth, for a woman who had a
systolic blood pressure reading of 110.5 mm Hg.
b. The z-score for one woman was 2.15. What was her systolic blood pressure
reading?
4. Test Scores. Which of the following three test scores is the highest relative
score? Show your solution.
a. A score of 65 on a test with a mean of 72 and a standard deviation of 8.2.
b. A score of 102 on a test with a mean of 130 and a standard deviation of 18.5.
c. A score of 605 on a test with a mean of 720 and a standard deviation of 116.4.
5. Reading Test. On a reading test, Shaylen’s score of 455 was higher than the
scores of 4256 of the 7210 students who took the test. Find the percentile,
rounded to the nearest percent, for Shaylen’s score.
6. Placement Exams. On a placement examination, Rick scored lower than 12,010
of the 12,860 students who took the exam. Find the percentile, rounded to the
nearest percent, for Rick’s score.
7. Test Scores. Rene scored at the 84th percentile on a test given to 12,600
students. How many students scored higher than Rene?
8. Commute to School. A survey was given to 18 students. One question asked
about the one-way distance the student had to travel to attend college. The
results, in miles, are shown in the following table. Use the median procedure for
finding the quartiles to find the first, second and third quartiles for the data.

Miles Travelled to Attend College


12 18 4 5 26 41 1 8 10
10 3 28 32 10 85 7 5 15

LINEAR REGRESSION AND CORRELATION

1. Given the bivariate data:


x 3 4 5 6 7
y 2 3 3 5 5
a. Draw a scatter diagram for the data.
b. Find $n$, $\sum x$, $\sum y$, $\sum x^2$, $(\sum x)^2$, and $\sum xy$.

c. Find a, the slope of the least-squares line, and b, the y-intercept of the
least-squares line.
d. Draw the least-squares line on the scatter diagram from part a.
e. Is the point $(\bar{x}, \bar{y})$ on the least-squares line?
f. Use the equation of the least-squares line to predict the value of y when x = 7.3.
g. Find, to the nearest hundredth, the linear correlation coefficient.
2. Paleontology. The following table shows the length, in centimeters, of the
humerus and the total wingspan, in centimeters, of several pterosaurs, which are
extinct flying reptiles.
Pterosaur Data

Humerus Wingspan
24 600
32 750
22 430
17 370
13 270
4.4 68
3.2 53
1.5 24
20 500
27 570
15 300
15 310
9 240
4.4 55
2.9 50
a. Find the equation of the least-squares line for the data. Round the constants
to the nearest hundredth.
b. Use the equation from part a to determine, to the nearest centimeter, the
projected wingspan of a pterosaur if its humerus is 54 cm.
3. Life Expectancy. The average remaining lifetimes for men of various ages in
the United States are given in the following table.

Average Remaining Lifetimes for Men

Age Years
0 74.9
15 60.6
35 42.0
65 16.8
75 10.2
a. Find the linear correlation coefficient for the data.
b. On the basis of the value of the linear correlation coefficient, would you
conclude, at the |r|>0.9 level, that the data can be reasonably modeled by a
linear equation? Explain.
