
Engineering Data Analysis

Statistics is the methodology for collecting, analyzing, and drawing conclusions from data. It involves determining what data to collect, organizing and summarizing the data, analyzing it to draw conclusions, and assessing the strength and uncertainty of those conclusions. Statistics has applications in many fields and is used to study topics like medical treatments, consumer behavior, social attitudes, and more. Population refers to all individuals or items under consideration, while a sample is the subset of individuals from which data is collected. Descriptive statistics organizes and summarizes sample data, while inferential statistics uses sample data to make conclusions about the larger population. The goal of statistical analysis is to gain understanding from data by defining variables, collecting data, organizing and summarizing it, and

Uploaded by

Vincent Iluis

ENGINEERING DATA ANALYSIS
WHAT IS STATISTICS?
Statistics is a very broad subject, with applications in
a vast number of different fields. In general, one can say
that statistics is the methodology for collecting, analyzing,
interpreting and drawing conclusions from information.
In other words, statistics is the methodology
which scientists and mathematicians have developed for
interpreting and drawing conclusions from collected data.
Everything that deals even remotely with the collection,
processing, interpretation and presentation of data
belongs to the domain of statistics, and so does the
detailed planning that precedes all these activities.
Definition 1.1 (Statistics). Statistics consists of a body of
methods for collecting and analyzing data.
(Agresti & Finlay, 1997)
From the above, it should be clear that statistics is much
more than just the tabulation of numbers and the
graphical presentation of these tabulated numbers.
Statistics is the science of gaining information from
numerical and categorical data. Statistical methods can be
used to find answers to questions like:
• What kind of data, and how much, needs to be collected?
• How should we organize and summarize the data?
• How can we analyze the data and draw conclusions
from it?
• How can we assess the strength of the conclusions
and evaluate their uncertainty?
Furthermore, statistics is the science of dealing with
uncertain phenomena and events. Statistics in practice is
applied successfully to study the effectiveness of medical
treatments, the reaction of consumers to television
advertising, the attitudes of young people toward sex and
marriage, and much more. It’s safe to say that nowadays
statistics is used in every field of science.
Example 1.1 (Statistics in practice). Consider the
following problems:
–agricultural problem: Is a new grain seed or fertilizer more
productive?
–medical problem: What is the right dosage of a
drug for a treatment?
–political science: How accurate are Gallup and other
opinion polls?
–economics: What will the unemployment rate be next
year?
–technical problem: How can the quality of a product be improved?
POPULATION AND SAMPLE

Population and sample are two basic concepts
of statistics. A population can be characterized as
the set of individual persons or objects in which an
investigator is primarily interested during his or
her research. Sometimes the desired
measurements are obtained for all individuals in the
population, but often only a subset of the individuals of
that population is observed; such a set of
individuals constitutes a sample. This gives us the
following definitions of population and sample.
Definition 1.2 (Population). Population is the collection of all
individuals or items under consideration in a statistical study.
(Weiss, 1999)

Definition 1.3 (Sample). Sample is that part of the population
from which information is collected.
(Weiss, 1999)
Only a certain, relatively small number of features of an
individual person or object are under investigation at the
same time. Not all the properties of the individuals in the
population need to be measured. This
observation emphasizes the importance of a set of
measurements and thus gives us alternative definitions of
population and sample.
Definition 1.4 (Population). A (statistical) population is the set of
measurements (or record of some qualitative trait) corresponding to
the entire collection of units for which inferences are to be made.
(Johnson & Bhattacharyya, 1992)

Definition 1.5 (Sample). A sample from a statistical population is the
set of measurements that are actually collected in the course of an
investigation.
(Johnson & Bhattacharyya, 1992)
When population and sample are defined in the manner of
Johnson & Bhattacharyya, it is useful to define the
source of each measurement as a sampling unit, or simply
a unit.

The population always represents the target of an
investigation. We learn about the population by sampling
from the collection. There can be many different kinds of
populations; the following examples demonstrate possible
differences between populations.
Example 1.2 (Finite population). In many cases the population under
consideration is one which could be physically listed. For example:

–The students of the University of Tampere,
–The books in a library.

Example 1.3 (Hypothetical population). In many other cases the
population is much more abstract and may arise from the phenomenon
under consideration.

Consider, e.g., a factory producing light bulbs. If the factory keeps
using the same equipment, raw materials and methods of production
in the future, then the bulbs that will be produced in the factory constitute a
hypothetical population. That is, a sample of light bulbs taken from the current
production line can be used to make inferences about the qualities of light
bulbs produced in the future.
DESCRIPTIVE AND INFERENTIAL STATISTICS
There are two major types of statistics. The branch of statistics
devoted to the summarization and description of data is called
descriptive statistics and the branch of statistics concerned with
using sample data to make an inference about a population of
data is called inferential statistics.

Definition 1.6 (Descriptive Statistics). Descriptive statistics
consist of methods for organizing and summarizing information.
(Weiss, 1999)

Definition 1.7 (Inferential Statistics). Inferential statistics consist
of methods for drawing and measuring the reliability of
conclusions about population based on information obtained from
a sample of the population. (Weiss, 1999)
Descriptive statistics includes the construction of
graphs, charts, and tables, and the calculation of various
descriptive measures such as averages, measures
of variation, and percentiles. In fact, most of this
course deals with descriptive statistics.

Inferential statistics includes methods like point
estimation, interval estimation and hypothesis testing,
which are all based on probability theory.
Example 1.4 (Descriptive and Inferential Statistics). Consider
the event of rolling a die. The die is rolled 100 times and the results
form the sample data. Descriptive statistics is used to
group the sample data into a table of frequencies.

Inferential statistics can now be used to verify whether the
die is fair or not.
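As a rough illustration of the two steps, the sketch below first tabulates 100 rolls (descriptive) and then computes a chi-square goodness-of-fit statistic (inferential). The rolls are simulated, hypothetical data, since the original table is not available here, and the chi-square test is one common choice for checking fairness that the text itself does not name.

```python
import random
from collections import Counter

# Hypothetical data: 100 simulated rolls (seeded so the run is repeatable).
rng = random.Random(0)
rolls = [rng.randint(1, 6) for _ in range(100)]

# Descriptive step: frequency table of the six faces.
counts = Counter(rolls)
observed = [counts.get(face, 0) for face in range(1, 7)]

# Inferential step: chi-square goodness-of-fit statistic against a fair die,
# where every face is expected 100/6 times.
expected = len(rolls) / 6
chi_square = sum((o - expected) ** 2 / expected for o in observed)

# Critical value of the chi-square distribution with 5 degrees of freedom
# at the 5% significance level.
CRITICAL_5_PERCENT = 11.07

for face, o in zip(range(1, 7), observed):
    print(f"face {face}: {o}")
print(f"chi-square = {chi_square:.2f}")
if chi_square > CRITICAL_5_PERCENT:
    print("evidence against fairness at the 5% level")
else:
    print("no evidence against fairness at the 5% level")
```

The descriptive table alone only summarizes the sample; the comparison with the critical value is what turns it into a statement about the die.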
Descriptive and inferential statistics are interrelated. It is
almost always necessary to use methods of descriptive statistics
to organize and summarize the information obtained from a
sample before methods of inferential statistics can be used to
make a more thorough analysis of the subject under investigation.
Furthermore, the preliminary descriptive analysis of a sample
often reveals features that lead to the choice of the appropriate
inferential method to be used later.

Sometimes it is possible to collect the data from the whole
population. In that case it is possible to perform a descriptive
study on the population, just as is usually done on a sample. Only
when an inference is made about the population based on
information obtained from the sample does the study become
inferential.
STATISTICAL DATA ANALYSIS
The goal of statistics is to gain understanding from
data. Any data analysis should contain the following steps:
VARIABLES
A characteristic that varies from one person or thing
to another is called a variable, i.e., a variable is any
characteristic that varies from one individual member of
the population to another. Examples of variables for
humans are height, weight, number of siblings, sex,
marital status, and eye color. The first three of these
variables yield numerical information (numerical
measurements) and are examples of quantitative (or
numerical) variables; the last three yield non-numerical
information (non-numerical measurements) and are
examples of qualitative (or categorical) variables.
Quantitative variables can be classified as either discrete
or continuous.
Discrete variables. Some variables, such as the
number of children in a family, the number of car accidents
on a certain road on different days, or the number of
students taking a basics of statistics course, are the results
of counting and thus are discrete variables.
Typically, a discrete variable is a variable whose possible
values are some or all of the ordinary counting
numbers like 0, 1, 2, 3, . . . . As a definition, we can say
that a variable is discrete if it has only a countable
number of distinct possible values. That is, a variable is
discrete if it can assume only a finite number of values
or as many values as there are integers.
Continuous variables. Quantities such as length,
weight, or temperature can in principle be measured
arbitrarily accurately. There is no indivisible unit. Weight
may be measured to the nearest gram, but it could be
measured more accurately, say to the tenth of a gram.
Such a variable, called continuous, is intrinsically
different from a discrete variable.

A discrete variable is a variable whose value is obtained by counting.

A continuous variable is a variable whose value is obtained by
measuring.
SCALES
Scales for Qualitative Variables. Besides being classified as
either qualitative or quantitative, variables can be described
according to the scale on which they are defined. The scale of the
variable gives certain structure to the variable and also defines
the meaning of the variable.
The categories into which a qualitative variable falls may
or may not have a natural ordering. For example,
occupational categories have no natural ordering. If the
categories of a qualitative variable are unordered, then the
qualitative variable is said to be defined on a nominal scale,
the word nominal referring to the fact that the categories are
merely names. If the categories can be put in order, the scale
is called an ordinal scale. Depending on the scale on which a qualitative
variable is defined, the variable is called a nominal
variable or an ordinal variable. Examples of ordinal variables
are education (classified e.g. as low, high), "strength of
opinion" on some proposal (classified according to whether
the individual favors the proposal, is indifferent towards it, or
opposes it), and position at the end of a race (first,
second, etc.).
Scales for Quantitative Variables. Quantitative
variables, whether discrete or continuous, are defined
either on an interval scale or on a ratio scale. If one can
compare the differences between measurements of the
variable meaningfully, but not the ratio of the
measurements, then the quantitative variable is defined
on an interval scale. If, on the other hand, one can compare
both the differences between measurements of the
variable and the ratio of the measurements meaningfully,
then the quantitative variable is defined on a ratio scale. In
order for the ratio of the measurements to be meaningful,
the variable must have a natural, meaningful absolute zero
point, i.e., a ratio scale is an interval scale with a
meaningful absolute zero point. For example, temperature
measured on the Celsius (centigrade) scale is an interval variable
and the height of a person is a ratio variable.
The difference between interval and ratio
scales is that interval scales have no
absolute or true zero (for example, temperature
can be below 0 degrees Celsius, such as -10 or -20),
whereas ratio scales have a true zero value (for
example, height or weight is always
measured from 0 upward and is never
below 0).
ORGANIZATION OF THE DATA
Observing the values of the variables for one or more
people or things yields data. Each individual piece of data is
called an observation, and the collection of all observations for
particular variables is called a data set or data matrix. A data set
consists of the values of variables recorded for a set of sampling units.
For ease in manipulating (recording and sorting) the
values of a qualitative variable, they are often coded by
assigning numbers to the different categories, thus
converting the categorical data to numerical data in a trivial
sense. For example, marital status might be coded by letting
1, 2, 3, and 4 denote a person’s being single, married, widowed,
or divorced, but the coded data still continue to be nominal
data. Coded numerical data do not share any of the properties
of the numbers we deal with in ordinary arithmetic. With regard to
the codes for marital status, we cannot write 3 > 1 or 2 < 4, and
we cannot write 2 − 1 = 4 − 3 or 1 + 3 = 4. This illustrates how
important it is to always check whether the mathematical treatment
of statistical data is really legitimate.
QUALITATIVE VARIABLE

The number of observations that fall into a particular
class (or category) of the qualitative variable is called the
frequency (or count) of that class. A table listing all
classes and their frequencies is called a frequency
distribution. In addition to the frequencies, we are often
interested in the percentage of a class. We find the
percentage by dividing the frequency of the class by
the total number of observations and multiplying the result
by 100. The percentage of the class, expressed as a
decimal, is usually referred to as the relative frequency
of the class.
A table listing all classes and their relative
frequencies is called a relative frequency distribution.
The relative frequencies provide the most relevant
information as to the pattern of the data. One should also
state the sample size, which serves as an indicator of the
credibility of the relative frequencies. Relative
frequencies sum to 1 (100%).
A cumulative frequency (cumulative relative
frequency) is obtained by summing the frequencies
(relative frequencies) of all classes up to the specific
class. In the case of qualitative variables, cumulative
frequencies make sense only for ordinal variables, not
for nominal variables.

Qualitative data are presented graphically either as a
pie chart or as a horizontal or vertical bar graph.

A pie chart is a disk divided into pie-shaped pieces
proportional to the relative frequencies of the classes. To
obtain the angle for any class, we multiply its relative
frequency by 360 degrees, which corresponds to the
complete circle.
A horizontal bar graph displays the classes on the
horizontal axis and the frequencies (or relative
frequencies) of the classes on the vertical axis. The
frequency (or relative frequency) of each class is
represented by a vertical bar whose height is equal to the
frequency (or relative frequency) of the class. In a bar
graph, the bars do not touch each other. In a vertical bar
graph, the classes are displayed on the vertical axis and
the frequencies of the classes on the horizontal axis.
Nominal data is best displayed by a pie chart and ordinal
data by a horizontal or vertical bar graph.
Example 3.1. Let the blood types of 40 persons be
as follows:

O O A B A O A A A O B O B O O A O O A A A A AB A
B A A O O A O O A A A O A O O AB
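A minimal sketch of how the frequency distribution, the relative frequencies and the pie-chart angles for Example 3.1 could be computed. The variable names are illustrative, and only the Python standard library is assumed.

```python
from collections import Counter

# The 40 blood types from Example 3.1.
blood_types = (
    "O O A B A O A A A O B O B O O A O O A A A A AB A "
    "B A A O O A O O A A A O A O O AB"
).split()

n = len(blood_types)

# Frequency of each class.
frequency = Counter(blood_types)

# Relative frequency = frequency / total number of observations.
relative_frequency = {cls: f / n for cls, f in frequency.items()}

# Pie-chart angle = relative frequency * 360 degrees.
pie_angle = {cls: rf * 360 for cls, rf in relative_frequency.items()}

print(f"{'class':<6}{'freq':>6}{'rel. freq':>11}{'angle':>8}")
for cls, f in frequency.most_common():
    print(f"{cls:<6}{f:>6}{relative_frequency[cls]:>11.2f}{pie_angle[cls]:>8.0f}")
print(f"n = {n}")
```

Since blood type is a nominal variable, no cumulative frequencies are computed here.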
QUANTITATIVE VARIABLE
The data of a quantitative variable can also be presented
by a frequency distribution. If the discrete variable can take
only a few different values, then the data of the discrete
variable can be summarized in the same way as qualitative
variables in a frequency table. In place of the qualitative
categories, we now list in the frequency table the distinct
numerical measurements that appear in the discrete data set
and then count their frequencies.

If the discrete variable can have a lot of different values, or
the quantitative variable is continuous, then the
data must be grouped into classes (categories) before the
table of frequencies can be formed. The main steps in the
process of grouping a quantitative variable into classes are:
(a) Find the minimum and the maximum values the variable
has in the data set.
(b) Choose intervals of equal length that cover the range
between the minimum and the maximum without
overlapping. These are called class intervals, and their
end points are called class limits.
(c) Count the number of observations in the data that
belong to each class interval. The count in each class is
the class frequency.
(d) Calculate the relative frequency of each class by
dividing the class frequency by the total number of
observations in the data.
The number in the middle of a class is called the class
mark of the class. The number midway between the upper
class limit of one class and the lower class limit of the
next class is called the real class limit. As a rule of
thumb, it is generally satisfactory to group the observed
values of a numerical variable into 5 to 15 class
intervals. A smaller number of intervals is used if the number
of observations is relatively small; if the number of
observations is large, the number of intervals may be
greater than 15.
Quantitative data are usually presented graphically
either as a histogram or as a horizontal or vertical bar
graph. The histogram is like a horizontal bar graph except
that its bars do touch each other. The histogram is formed
from grouped data, displaying either frequencies or
relative frequencies (percentages) of each class interval.
If quantitative data is discrete with only a few possible
values, then the variable should be presented graphically
by a bar graph. Also, if for some reason it is more reasonable
to form the frequency table for a quantitative variable with
unequal class intervals, then the variable should
also be presented graphically by a bar graph.

Example 3.2. Age (in years) of 102 people:

34, 67, 40, 72, 37, 33, 42, 62, 49, 32, 52, 40, 31, 19, 68, 55, 57, 54, 37,
32, 54, 38, 20, 50, 56, 48, 35, 52, 29, 56, 68, 65, 45, 44, 54, 39, 29,
56, 43, 42, 22, 30, 26, 20, 48, 29, 34, 27, 40, 28, 45, 21, 42, 38, 29, 26,
62, 35, 28, 24, 44, 46, 39, 29, 27, 40, 22, 38, 42, 39, 26, 48, 39, 25,
34, 56, 31, 60, 32, 24, 51, 69, 28, 27, 38, 56, 36, 25, 46, 50, 36, 58, 39,
57, 55, 42, 49, 38, 49, 36, 48, 44
Example. Construct a frequency distribution table with
6 classes for the given data.

70 81 73 66 69 78 68 53 64 68
57 26 42 36 50 20 61 36 51 53
72 44 44 52 77 106 52 69 35 39
73 56 46 67 33 30 35 64 61 73
56 72 40 29 56 68 55 86 88 83
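Steps (a) to (d) of the grouping procedure can be sketched for this data as follows. The choice of class width (the smallest whole number for which 6 classes starting at the minimum cover the maximum) is an assumption made for illustration; the exercise itself does not prescribe the class limits.

```python
data = [
    70, 81, 73, 66, 69, 78, 68, 53, 64, 68,
    57, 26, 42, 36, 50, 20, 61, 36, 51, 53,
    72, 44, 44, 52, 77, 106, 52, 69, 35, 39,
    73, 56, 46, 67, 33, 30, 35, 64, 61, 73,
    56, 72, 40, 29, 56, 68, 55, 86, 88, 83,
]
NUM_CLASSES = 6

# (a) Minimum and maximum of the data.
low, high = min(data), max(data)

# (b) Equal-width, non-overlapping class intervals (integer class limits).
# Assumed rule: smallest integer width such that 6 classes cover the range.
width = (high - low) // NUM_CLASSES + 1
classes = [(low + k * width, low + (k + 1) * width - 1) for k in range(NUM_CLASSES)]

# (c) Class frequencies.
frequencies = [sum(lo <= x <= hi for x in data) for lo, hi in classes]

# (d) Relative frequencies.
relative = [f / len(data) for f in frequencies]

print(f"{'class':<10}{'f':>4}{'rel. f':>8}")
for (lo, hi), f, rf in zip(classes, frequencies, relative):
    print(f"{lo:>3}-{hi:<6}{f:>4}{rf:>8.2f}")
```

A different starting point or width gives a different, equally valid table, as long as the classes do not overlap and cover all observations.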
MEAN, MEDIAN & MODE
Definition 1: Median.

The median is the middle number of a set of numbers
arranged in numerical order.

1. If the number of observations is odd, then the sample
median is the observed value exactly in the middle of the
ordered list.
2. If the number of observations is even, then the sample
median is the number halfway between the two middle
observed values in the ordered list.

Median =
Example. 7 participants in a bike race had the following
finishing times in minutes:
28, 22, 26, 29, 21, 23, 24.
What is the median?

Example. 8 participants in a bike race had the following
finishing times in minutes: 28, 22, 26, 29, 21, 23, 24, 50.
What is the median?

Example. 3, 10, 2, 8, 7, 5, 2, 5
What is the median?
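The two rules in the definition can be checked against these examples with a short sketch. The function name `sample_median` is an illustrative choice; Python's standard `statistics.median` would give the same results.

```python
def sample_median(values):
    """Median by the two rules above: middle value (odd n) or
    halfway between the two middle values (even n)."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

seven_riders = [28, 22, 26, 29, 21, 23, 24]
eight_riders = [28, 22, 26, 29, 21, 23, 24, 50]
third_example = [3, 10, 2, 8, 7, 5, 2, 5]

print(sample_median(seven_riders))   # 24
print(sample_median(eight_riders))   # 25.0
print(sample_median(third_example))  # 5.0
```

Note that adding the slow rider (50 minutes) moves the median only from 24 to 25.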
Definition 2: Mean.

The mean is the sum of all the values in a set, divided by
the number of values. The mean of a sample is usually denoted by
\(\bar{x}\).

Example. 7 participants in a bike race had the following finishing
times in minutes: 28, 22, 26, 29, 21, 23, 24.
What is the mean?

Example. 8 participants in a bike race had the following finishing
times in minutes: 28, 22, 26, 29, 21, 23, 24, 50.
What is the mean?
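A small sketch of the mean for the two bike-race examples, written directly from the definition (sum divided by count). The function name is illustrative.

```python
def sample_mean(values):
    """Sum of all values divided by the number of values."""
    return sum(values) / len(values)

seven_riders = [28, 22, 26, 29, 21, 23, 24]
eight_riders = [28, 22, 26, 29, 21, 23, 24, 50]

print(f"{sample_mean(seven_riders):.3f}")  # 24.714
print(f"{sample_mean(eight_riders):.3f}")  # 27.875
```

Unlike the median, the mean is pulled noticeably upward by the single large value 50.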
Definition 3: Mode.

The mode is the most frequent value in a set. A set can have
more than one mode; if it has two, it is said to be bimodal.

Example 1:
The mode of {1, 1, 2, 3, 5, 8}?

The modes of {1, 3, 5, 7, 9, 9, 21, 25, 25, 31}?
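The two sets above can be checked with a short sketch that counts how often each value occurs and keeps every value with the highest count. The helper name `modes` is illustrative.

```python
from collections import Counter

def modes(values):
    """All values that occur with the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

print(modes([1, 1, 2, 3, 5, 8]))                     # [1]
print(modes([1, 3, 5, 7, 9, 9, 21, 25, 25, 31]))     # [9, 25]
```

The second set has two modes, so it is bimodal.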


GROUPED DATA

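\[ \text{Mean} = AM + \left( \frac{\sum fd}{N} \right) i \]

\[ \text{Median} = L + \left( \frac{\frac{N}{2} - F}{f} \right) i \]

\[ \text{Mode} = 3(\text{Median}) - 2(\text{Mean}) \]

\[ \text{interval } (i) = \frac{\text{Highest no.} - \text{Lowest no.}}{10} \]

Where:
AM = assumed mean
f = frequency
d = deviation
N = no. of data
i = interval
L = lower limit
F = partial sum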
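The grouped-data formulas can be sketched in a few lines. The frequency table used here is the one from the grouped standard deviation exercise in this document (classes 10-12 to 22-24). Two assumptions are made: d is the deviation of each class mark from the assumed mean, measured in class widths, and L is the lower real class limit of the median class (stated lower limit minus 0.5), which is the reading that reproduces the quartile answers given in this document.

```python
# Classes in increasing order with their frequencies.
classes = [(10, 12), (13, 15), (16, 18), (19, 21), (22, 24)]
freqs = [4, 8, 7, 6, 5]

i = classes[0][1] - classes[0][0] + 1          # class interval (width): 3
N = sum(freqs)                                 # number of data: 30
marks = [(lo + hi) / 2 for lo, hi in classes]  # class marks (midpoints)

# Mean = AM + (sum(f*d) / N) * i, with AM taken as the middle class mark.
AM = marks[len(marks) // 2]
d = [(m - AM) / i for m in marks]              # deviations in class widths
mean = AM + sum(f * dk for f, dk in zip(freqs, d)) / N * i

# Median = L + ((N/2 - F) / f) * i, using the class that contains N/2.
F = 0                                          # partial sum below the class
for (lo, hi), f in zip(classes, freqs):
    if F + f >= N / 2:
        L = lo - 0.5                           # assumed: lower real class limit
        median = L + (N / 2 - F) / f * i
        break
    F += f

# Mode = 3(Median) - 2(Mean), the empirical relation used in this document.
mode = 3 * median - 2 * mean

print(f"mean = {mean:.2f}, median = {median:.2f}, mode = {mode:.2f}")
```

Any class mark can serve as the assumed mean; a central one just keeps the deviations small.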
Ex. Find the mean, median and mode.

X1   X2   X3
15   18   15
18   20   14
20   21   18
17   25   17
22   23   24
15   18   23
13   14   18
18   17   20
10   14   21
19    9   14

Ans.
Mean = 17.7
Median = 18.07
Mode = 18.81
Ex. Find the mean, median and mode.

X1   X2   X3   X4   X5   X6
46   80   57   59   94   76
48   48   61   65   86   65
64   60   63   68   41   66
76   64   68   67   68   27
78   59   72   71   67   68
54   62   64   72   61   69
39   57   57   75   69   61

Ans.
Mean = 63.83
Median = 64.75
Mode = 66.59
RANGE

Definition 5.1 (Range). The sample range of the
variable is the difference between its maximum
and minimum values in a data set.
INTERQUARTILE RANGE
Definition 5.2 (Quartiles). Let n denote the number of
observations in a data set. Arrange the observed
values of the variable in the data in increasing order.

Definition 5.3 (Interquartile range). The sample
interquartile range of the variable, denoted IQR, is
the difference between the third and first quartiles of
the variable, that is,
\[ IQR = Q_3 - Q_1 \]
Example 5.4. 7 participants in a bike race had
the following finishing times in minutes:
28, 22, 26, 29, 21, 23, 24.
What is the interquartile range?

Example 5.5. 8 participants in a bike race had
the following finishing times in minutes:
28, 22, 26, 29, 21, 23, 24, 50.
What is the interquartile range?
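A sketch of the range and the interquartile range for Examples 5.4 and 5.5. The quartile rule of Definition 5.2 is not reproduced in this text, so the code assumes the common rule that places the p-th quartile at position p(n + 1)/4 in the ordered list and interpolates linearly between neighbours; this is what Python's `statistics.quantiles` does with `method="exclusive"`. Other quartile conventions can give slightly different values.

```python
import statistics

def sample_range(values):
    """Maximum minus minimum."""
    return max(values) - min(values)

def interquartile_range(values):
    """Q3 - Q1, with quartiles at positions p(n + 1)/4 (assumed convention)."""
    q1, _q2, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return q3 - q1

seven_riders = [28, 22, 26, 29, 21, 23, 24]
eight_riders = [28, 22, 26, 29, 21, 23, 24, 50]

print(sample_range(seven_riders), interquartile_range(seven_riders))  # 8 6.0
print(sample_range(eight_riders), interquartile_range(eight_riders))  # 29 6.5
```

The extra value 50 increases the range from 8 to 29, while the interquartile range barely changes.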
GROUPED DATA
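\[ Q_1 = L + \left( \frac{\frac{N}{4} - F}{f} \right) i \]

\[ \text{Median} = Q_2 = L + \left( \frac{\frac{N}{2} - F}{f} \right) i \]

\[ Q_3 = L + \left( \frac{\frac{3N}{4} - F}{f} \right) i \]

Where:
f = frequency
N = no. of data
i = interval
L = lower limit
F = partial sum
Q1 = First Quartile
Q2 = Second Quartile
Q3 = Third Quartile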
Ex. Find Q1, Q2, Q3.

interval   f
65-69      2
60-64      4
55-59      7
50-54      12
45-49      15
40-44      10
35-39      8
30-34      7
25-29      6
20-24      3

Ans.
Q1 = 36.06
Q2 = 45.5
Q3 = 52.21
Ex. Find Q1, Q2, Q3.

interval   f
96-99      3
92-95      5
88-91      10
84-87      16
80-83      20
76-79      18
72-75      15
68-71      10
64-67      5

Ans.
Q1 = 74.3
Q2 = 80.1
Q3 = 85.625
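The grouped quartile formulas can be checked against both exercises with a short sketch. The function name and the table layout are illustrative; L is taken as the lower real class limit (stated lower limit minus 0.5) and F as the cumulative frequency below the quartile class, which is the reading that reproduces the answers listed above.

```python
def grouped_quartile(classes, freqs, k):
    """k-th quartile (k = 1, 2, 3) from a grouped frequency table.

    classes: (lower, upper) class limits in increasing order.
    freqs:   matching class frequencies.
    """
    N = sum(freqs)
    target = k * N / 4
    i = classes[0][1] - classes[0][0] + 1   # class interval (width)
    F = 0                                   # partial sum below current class
    for (lo, _hi), f in zip(classes, freqs):
        if F + f >= target:
            L = lo - 0.5                    # lower real class limit
            return L + (target - F) / f * i
        F += f
    raise ValueError("k must be 1, 2 or 3")

# First exercise, listed from the lowest class upward.
classes_a = [(20, 24), (25, 29), (30, 34), (35, 39), (40, 44),
             (45, 49), (50, 54), (55, 59), (60, 64), (65, 69)]
freqs_a = [3, 6, 7, 8, 10, 15, 12, 7, 4, 2]

# Second exercise, listed from the lowest class upward.
classes_b = [(64, 67), (68, 71), (72, 75), (76, 79), (80, 83),
             (84, 87), (88, 91), (92, 95), (96, 99)]
freqs_b = [5, 10, 15, 18, 20, 16, 10, 5, 3]

for name, cls, fr in (("first", classes_a, freqs_a), ("second", classes_b, freqs_b)):
    q1, q2, q3 = (grouped_quartile(cls, fr, k) for k in (1, 2, 3))
    print(f"{name}: Q1 = {q1:.2f}, Q2 = {q2:.2f}, Q3 = {q3:.3f}")
```

The tables in the exercises are listed from the highest class down, so they are reversed here before accumulating the partial sums.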
STANDARD DEVIATION

“Standard deviation is a measure that is used to
quantify the amount of variation or dispersion of a set of
data values.”
Food for the brain: If based on the total population, the
standard deviation is called the population standard deviation, while
if based only on a random sample, it is called the sample standard
deviation.
SAMPLE STANDARD DEVIATION

The sample standard deviation is the most frequently
used measure of variability, although it is not as easily
understood as ranges. It can be considered as a kind of
average of the absolute deviations of observed values
from the mean of the variable in question.

\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \]
→ Sample standard deviation

where: s = sample standard deviation
\(\bar{x}\) = mean of the random sample
n = number of observations in the random sample
In the formula of the standard deviation, the sum of the
squared deviations from the mean is called the sum of squared
deviations and provides a measure of total deviation from the
mean for all the observed values of the variable. Once the
sum of squared deviations is divided by n − 1, we get a quantity
which is called the sample variance.

\[ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \]
→ Sample variance

where: \(s^2\) = sample variance
\(\bar{x}\) = mean of the random sample
n = number of observations in the random sample
Ex. 7 participants in a bike race had the following finishing
times in minutes: 28, 22, 26, 29, 21, 23, 24.
What is the sample standard deviation and the sample
variance?

Ex. Solve the sample variance of the following numbers:
2, 4, 6, 8, 10, 12, 14
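The two exercises can be worked through with a sketch that follows the formulas step by step: mean, sum of squared deviations, division by n − 1, then the square root. The function names are illustrative; Python's `statistics.variance` and `statistics.stdev` use the same n − 1 divisor.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    sum_sq_dev = sum((x - mean) ** 2 for x in values)
    return sum_sq_dev / (n - 1)

def sample_std_dev(values):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(values))

bike_times = [28, 22, 26, 29, 21, 23, 24]
even_numbers = [2, 4, 6, 8, 10, 12, 14]

print(f"bike race: s^2 = {sample_variance(bike_times):.3f}, "
      f"s = {sample_std_dev(bike_times):.3f}")
print(f"2..14:     s^2 = {sample_variance(even_numbers):.3f}")
```

For the bike race, the sum of squared deviations is 388/7, so the variance is 388/42, about 9.24, and the standard deviation is about 3.04 minutes.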
POPULATION STANDARD DEVIATION
The population mean is the average of the population
measurements. The population standard deviation
describes the variation of the population measurements
about the population mean.

\[ \sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} \]
→ Population standard deviation

where: \(\sigma\) = population standard deviation
\(\mu\) = mean of the population data
N = total number of observations in the population

\[ \sigma^2 = \frac{\sum (x - \mu)^2}{N} \]
→ Population variance

where: \(\sigma^2\) = population variance
\(\mu\) = mean of the population data
N = total number of observations in the population
Ex. The monthly rainfall (in inches) in a given place is as
follows: Jan, 1 in; Feb, 2 in; Mar, 4 in; Apr, 6 in; May, 18
in; June, 37 in; July, 31 in; Aug, 16 in; Sept, 28 in; Oct, 24
in; Nov, 9 in; and Dec, 4 in. What is the standard deviation
of this data?

Ex. During the years of the Great Depression, the weekly
average hours worked in manufacturing jobs were 45,
41, 43, 39, 39, 35, 37, 40, 39, 36 and 37, respectively.
What is the variance?
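A sketch of the population formulas applied to the two exercises. It assumes that each data set is treated as a complete population, as the placement of the exercises in this section suggests, so the divisor is N rather than n − 1. The function name is illustrative.

```python
import math

def population_variance(values):
    """Sum of squared deviations from the population mean, divided by N."""
    N = len(values)
    mu = sum(values) / N
    return sum((x - mu) ** 2 for x in values) / N

# Monthly rainfall in inches, January to December.
rainfall = [1, 2, 4, 6, 18, 37, 31, 16, 28, 24, 9, 4]
# Weekly average hours worked in manufacturing jobs.
hours = [45, 41, 43, 39, 39, 35, 37, 40, 39, 36, 37]

rain_var = population_variance(rainfall)
rain_sd = math.sqrt(rain_var)
hours_var = population_variance(hours)

print(f"rainfall: sigma^2 = {rain_var:.2f}, sigma = {rain_sd:.2f}")
print(f"hours:    sigma^2 = {hours_var:.2f}")
```

For the rainfall data the mean is 15 inches and the sum of squared deviations is 1724, which gives a variance of about 143.67 and a standard deviation of about 11.99 inches.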
GROUPED DATA

\[ s = \sqrt{\frac{\sum f (x - \bar{x})^2}{\sum f - 1}} \qquad s^2 = \frac{\sum f (x - \bar{x})^2}{\sum f - 1} \]

where: s = standard deviation
\(s^2\) = variance
\(\bar{x}\) = mean of the data
f = frequency
Ex. Find the standard deviation and the
variance.

x       Frequency
22-24   5
19-21   6
16-18   7
13-15   8
10-12   4

Ans.
S = 3.94
Variance = 15.52
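The answer to this exercise can be reproduced with a short sketch of the grouped formulas. It assumes that x in the formulas stands for the class mark (midpoint) of each class, which is the reading that yields the stated answers.

```python
import math

# Classes and frequencies from the exercise.
classes = [(22, 24), (19, 21), (16, 18), (13, 15), (10, 12)]
freqs = [5, 6, 7, 8, 4]

# Class marks (midpoints) stand in for the individual observations.
marks = [(lo + hi) / 2 for lo, hi in classes]
total_f = sum(freqs)

# Grouped mean: sum(f * x) / sum(f).
mean = sum(f * x for f, x in zip(freqs, marks)) / total_f

# Grouped sample variance: sum(f * (x - mean)^2) / (sum(f) - 1).
variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, marks)) / (total_f - 1)
std_dev = math.sqrt(variance)

print(f"mean = {mean:.2f}")          # 17.00
print(f"variance = {variance:.2f}")  # 15.52
print(f"s = {std_dev:.2f}")          # 3.94
```

Here the weighted sum of squared deviations is 450 with 30 observations, so the variance is 450/29.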
Ex. The data represents the ages of 40 women when they
each had a boyfriend. Construct a grouped frequency
distribution with a class of 5 and find the standard
deviation and variance of the data.

18 20 20 20 20 21 20 17 19 20
13 18 22 26 20 19 22 15 18 27
16 23 24 17 25 24 16 20 16 15
21 17 23 16 21 17 26 16 23 19
