Numerical Measures New

The document presents an overview of descriptive statistics, focusing on numerical measures such as location, variability, and distribution shape. It details various statistical measures including mean, median, mode, percentiles, and quartiles, with examples related to apartment rents. The content is structured to aid understanding of how to analyze and interpret data effectively.

Uploaded by

XIAO LA

Statistics for Business and Economics
Anderson, Sweeney, Williams
Slides by John Loucks, St. Edward’s University

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.

Descriptive Statistics: Numerical Measures

1. Measures of Location
2. Measures of Variability
3. Measures of Distribution Shape, Relative Location, and Detecting Outliers
4. Exploratory Data Analysis
5. Measures of Association Between Two Variables
6. The Weighted Mean and Working with Grouped Data

1. Measures of Location

• Mean
• Median
• Mode
• Percentiles
• Quartiles

If the measures are computed for data from a sample, they are called sample statistics.

If the measures are computed for data from a population, they are called population parameters.

A sample statistic is referred to as the point estimator of the corresponding population parameter.

1.1 Mean

• Perhaps the most important measure of location is the mean.
• The mean provides a measure of central location.
• The mean of a data set is the average of all the data values.
• The sample mean $\bar{x}$ is the point estimator of the population mean $\mu$.

Sample Mean $\bar{x}$

$$\bar{x} = \frac{\sum x_i}{n}$$

where $\sum x_i$ is the sum of the values of the $n$ observations and $n$ is the number of observations in the sample.

Population Mean $\mu$

$$\mu = \frac{\sum x_i}{N}$$

where $\sum x_i$ is the sum of the values of the $N$ observations and $N$ is the number of observations in the population.

Sample Mean

• Example: Apartment Rents

Seventy efficiency apartments were randomly sampled in a small college town. The monthly rent prices for these apartments are listed below.

445 615 430 590 435 600 460 600 440 615
440 440 440 525 425 445 575 445 450 450
465 450 525 450 450 460 435 460 465 480
450 470 490 472 475 475 500 480 570 465
600 485 580 470 490 500 549 500 500 480
570 515 450 445 525 535 475 550 480 510
510 575 490 435 600 435 445 435 430 440

$$\bar{x} = \frac{\sum x_i}{n} = \frac{34{,}356}{70} = 490.80$$

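The sample mean calculation above can be reproduced with a few lines of Python. This is only a minimal sketch; the variable names (`rents`, `total`, `sample_mean`) are illustrative and not part of the slides.

```python
# Monthly rents for the 70 sampled apartments, in the order listed on the slide.
rents = [
    445, 615, 430, 590, 435, 600, 460, 600, 440, 615,
    440, 440, 440, 525, 425, 445, 575, 445, 450, 450,
    465, 450, 525, 450, 450, 460, 435, 460, 465, 480,
    450, 470, 490, 472, 475, 475, 500, 480, 570, 465,
    600, 485, 580, 470, 490, 500, 549, 500, 500, 480,
    570, 515, 450, 445, 525, 535, 475, 550, 480, 510,
    510, 575, 490, 435, 600, 435, 445, 435, 430, 440,
]

n = len(rents)           # number of observations in the sample
total = sum(rents)       # sum of the values of the n observations
sample_mean = total / n  # x-bar = (sum of x_i) / n

print(n, total, round(sample_mean, 2))
```
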
1.2 Median

• The median of a data set is the value in the middle when the data items are arranged in ascending order.
• Whenever a data set has extreme values, the median is the preferred measure of central location.
• The median is the measure of location most often reported for annual income and property value data.
• A few extremely large incomes or property values can inflate the mean.

Median

• For an odd number of observations, the median is the middle value.

7 observations: 26 18 27 12 14 27 19
In ascending order: 12 14 18 19 26 27 27
Median = 19

• For an even number of observations, the median is the average of the middle two values.

8 observations: 26 18 27 12 14 27 30 19
In ascending order: 12 14 18 19 26 27 27 30
Median = (19 + 26)/2 = 22.5

Median

• Example: Apartment Rents

Averaging the 35th and 36th data values:
Median = (475 + 475)/2 = 475
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615

Note: Data is in ascending order.

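The odd/even rule for the median can be sketched as a small Python function. The helper name `median_of` is hypothetical; it simply follows the rule stated on the slides (middle value for an odd count, average of the two middle values for an even count).

```python
def median_of(values):
    """Return the median using the odd/even rule from the slides."""
    ordered = sorted(values)          # arrange the data in ascending order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]        # odd n: the middle value
    # even n: average of the two middle values
    return (ordered[middle - 1] + ordered[middle]) / 2


rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

print(median_of([26, 18, 27, 12, 14, 27, 19]))      # 7 observations
print(median_of([26, 18, 27, 12, 14, 27, 30, 19]))  # 8 observations
print(median_of(rents))                             # apartment rents
```
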
Trimmed Mean

• Another measure, sometimes used when extreme values are present, is the trimmed mean.
• It is obtained by deleting a percentage of the smallest and largest values from a data set and then computing the mean of the remaining values.
• For example, the 5% trimmed mean is obtained by removing the smallest 5% and the largest 5% of the data values and then computing the mean of the remaining values.

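A trimmed mean might be sketched as follows. The slides do not say how to handle a trim count that is not a whole number, so this sketch assumes the count is truncated toward zero; the function name `trimmed_mean` and the 20-value data set are hypothetical.

```python
def trimmed_mean(values, percent):
    """Mean after dropping `percent`% of the smallest and largest values.

    Assumption (not from the slides): a fractional trim count is truncated.
    """
    if not 0 <= percent < 50:
        raise ValueError("percent must be in [0, 50)")
    ordered = sorted(values)
    k = int(len(ordered) * percent / 100)   # values removed from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Hypothetical data: 19 ordinary values and one extreme value.
data = list(range(1, 20)) + [1000]

print(sum(data) / len(data))   # ordinary mean, inflated by the extreme value
print(trimmed_mean(data, 5))   # 5% trimmed mean drops 1 value from each end
```
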
1.3 Mode

• The mode of a data set is the value that occurs with greatest frequency.
• The greatest frequency can occur at two or more different values.
• If the data have exactly two modes, the data are bimodal.
• If the data have more than two modes, the data are multimodal.

Mode

• Example: Apartment Rents

450 occurred most frequently (7 times).
Mode = 450

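Counting frequencies makes the mode easy to check. The sketch below uses `collections.Counter` from the Python standard library; the helper name `modes` is illustrative, and it returns every value tied for the greatest frequency so that bimodal and multimodal data are handled as well.

```python
from collections import Counter


def modes(values):
    """Return (sorted list of modes, greatest frequency)."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top), top


rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

print(modes(rents))
print(modes([1, 1, 2, 2, 3]))  # a bimodal example
```
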
1.4 Percentiles

• A percentile provides information about how the data are spread over the interval from the smallest value to the largest value.
• Admission test scores for colleges and universities are frequently reported in terms of percentiles.
• The pth percentile of a data set is a value such that at least p percent of the items take on this value or less and at least (100 - p) percent of the items take on this value or more.

Percentiles

• Arrange the data in ascending order.
• Compute index i, the position of the pth percentile: i = (p/100)n
• If i is not an integer, round up. The pth percentile is the value in the ith position.
• If i is an integer, the pth percentile is the average of the values in positions i and i + 1.

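The index procedure above can be sketched in Python. The function name `percentile` is hypothetical, and the sketch assumes p is a whole number strictly between 0 and 100 so that the index can be computed with exact integer arithmetic. Statistical packages often use other interpolation rules, so their results may differ from this method.

```python
def percentile(values, p):
    """pth percentile using the index rule i = (p/100) * n."""
    if not 0 < p < 100:
        raise ValueError("p must be between 0 and 100 (exclusive)")
    ordered = sorted(values)              # arrange in ascending order
    n = len(ordered)
    whole, remainder = divmod(p * n, 100)  # i = whole + remainder/100
    if remainder != 0:
        # i is not an integer: round up, take the value in that position
        return ordered[whole]             # position whole + 1 (1-based)
    # i is an integer: average the values in positions i and i + 1
    return (ordered[whole - 1] + ordered[whole]) / 2


rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

print(percentile(rents, 80))  # i = 56, an integer
print(percentile(rents, 75))  # i = 52.5, rounded up to 53
```
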
80th Percentile

• Example: Apartment Rents

i = (p/100)n = (80/100)70 = 56
Because i is an integer, average the 56th and 57th data values:
80th Percentile = (535 + 549)/2 = 542

“At least 80% of the items take on a value of 542 or less.” 56/70 = .8 or 80%
“At least 20% of the items take on a value of 542 or more.” 14/70 = .2 or 20%

1.5 Quartiles

• Quartiles are specific percentiles.
• First Quartile = 25th Percentile
• Second Quartile = 50th Percentile = Median
• Third Quartile = 75th Percentile

Third Quartile

• Example: Apartment Rents

Third quartile = 75th percentile
i = (p/100)n = (75/100)70 = 52.5, which is not an integer, so round up to 53
Third quartile = value in the 53rd position = 525

“At least 75% of the items take on a value of 525 or less.” 55/70 = 78.6%
“At least 25% of the items take on a value of 525 or more.” 18/70 = 25.7%

2. Measures of Variability

• It is often desirable to consider measures of variability (dispersion), as well as measures of location.
• For example, in choosing supplier A or supplier B we might consider not only the average delivery time for each, but also the variability in delivery time for each.

Measures of Location: Example


Measures of Variability

2.1 Range
2.2 Interquartile Range
2.3 Variance
2.4 Standard Deviation
2.5 Coefficient of Variation

2.1 Range

• The range of a data set is the difference between the largest and smallest data values.
• It is the simplest measure of variability.
• It is very sensitive to the smallest and largest data values.

Range

• Example: Apartment Rents

Range = largest value - smallest value
Range = 615 - 425 = 190

2.2 Interquartile Range

• The interquartile range of a data set is the difference between the third quartile and the first quartile.
• It is the range for the middle 50% of the data.
• It overcomes the sensitivity to extreme data values.

Interquartile Range

• Example: Apartment Rents

3rd Quartile (Q3) = 525
1st Quartile (Q1) = 445
Interquartile Range = Q3 - Q1 = 525 - 445 = 80

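Both the range and the interquartile range can be checked with a short script. This is a sketch under the assumption that the quartiles are found with the slides' index rule, i = (p/100)n; the helper `percentile` is a hypothetical name that restates that rule for whole-number p.

```python
def percentile(values, p):
    """pth percentile via the index rule i = (p/100) * n (whole-number p)."""
    ordered = sorted(values)
    whole, remainder = divmod(p * len(ordered), 100)
    if remainder != 0:
        return ordered[whole]                         # round i up
    return (ordered[whole - 1] + ordered[whole]) / 2  # average i and i + 1


rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

data_range = max(rents) - min(rents)   # largest value - smallest value
q1 = percentile(rents, 25)             # first quartile
q3 = percentile(rents, 75)             # third quartile
iqr = q3 - q1                          # range of the middle 50% of the data

print(data_range, q1, q3, iqr)
```
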
Measures of Variability: Example 1

2.3 Variance

The variance is a measure of variability that utilizes all the data.

The variance is useful in comparing the variability of two or more variables.

Variance

The variance is the average of the squared differences between each data value and the mean.

The variance is computed as follows:

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$ for a sample

$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$ for a population

2.4 Standard Deviation

The standard deviation of a data set is the positive square root of the variance.

It is measured in the same units as the data, making it more easily interpreted than the variance.

Standard Deviation

The standard deviation is computed as follows:

$$s = \sqrt{s^2}$$ for a sample

$$\sigma = \sqrt{\sigma^2}$$ for a population

2.5 Coefficient of Variation

The coefficient of variation indicates how large the standard deviation is in relation to the mean.

The coefficient of variation is computed as follows:

$$\left(\frac{s}{\bar{x}} \times 100\right)\%$$ for a sample

$$\left(\frac{\sigma}{\mu} \times 100\right)\%$$ for a population

Sample Variance, Standard Deviation, and Coefficient of Variation

• Example: Apartment Rents

• Variance
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = 2{,}996.16$$

• Standard Deviation
$$s = \sqrt{s^2} = \sqrt{2996.16} = 54.74$$

• Coefficient of Variation
$$\left(\frac{s}{\bar{x}} \times 100\right)\% = \left(\frac{54.74}{490.80} \times 100\right)\% = 11.15\%$$

The standard deviation is about 11% of the mean.

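The three sample measures can be recomputed directly from their definitions. The sketch below uses the n - 1 divisor for the sample variance, as in the formulas above; the variable names are illustrative.

```python
import math

rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

n = len(rents)
mean = sum(rents) / n

# Sample variance: sum of squared deviations divided by n - 1.
sum_sq_dev = sum((x - mean) ** 2 for x in rents)
variance = sum_sq_dev / (n - 1)

# Standard deviation: positive square root of the variance.
std_dev = math.sqrt(variance)

# Coefficient of variation: standard deviation as a percentage of the mean.
coef_var = std_dev / mean * 100

print(round(variance, 2), round(std_dev, 2), round(coef_var, 2))
```
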
Measures of Variability: Example 2

3. Measures of Distribution Shape, Relative Location, and Detecting Outliers

3.1 Distribution Shape
3.2 z-Scores
3.3 Chebyshev’s Theorem
3.4 Empirical Rule
3.5 Detecting Outliers

3.1 Distribution Shape: Skewness

• An important measure of the shape of a distribution is called skewness.
• The formula for the skewness of sample data is

$$\text{Skewness} = \frac{n}{(n - 1)(n - 2)} \sum \left(\frac{x_i - \bar{x}}{s}\right)^3$$

• Skewness can be easily computed using statistical software.

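As a rough illustration of the sample skewness formula, the sketch below implements it term by term. The function name `sample_skewness` and the three small data sets are hypothetical; `s` is the sample standard deviation with the n - 1 divisor.

```python
import math


def sample_skewness(values):
    """n / ((n-1)(n-2)) * sum(((x_i - mean) / s) ** 3); needs n >= 3 and s > 0."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    cubed = sum(((x - mean) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed


symmetric = [1, 2, 3, 4, 5]        # hypothetical symmetric data
right_tail = [1, 2, 3, 4, 20]      # one large value stretches the right tail
left_tail = [1, 17, 18, 19, 20]    # one small value stretches the left tail

print(sample_skewness(symmetric))   # approximately zero
print(sample_skewness(right_tail))  # positive
print(sample_skewness(left_tail))   # negative
```
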
Distribution Shape: Skewness

• Symmetric (not skewed)
  • Skewness is zero.
  • Mean and median are equal.
  [Histogram of relative frequency: skewness = 0]

• Moderately Skewed Left
  • Skewness is negative.
  • Mean will usually be less than the median.
  [Histogram of relative frequency: skewness = -.31]

• Moderately Skewed Right
  • Skewness is positive.
  • Mean will usually be more than the median.
  [Histogram of relative frequency: skewness = .31]

• Highly Skewed Right
  • Skewness is positive (often above 1.0).
  • Mean will usually be more than the median.
  [Histogram of relative frequency: skewness = 1.25]

Distribution Shape: Skewness

• Example: Apartment Rents

Seventy efficiency apartments were randomly sampled in a college town. The monthly rent prices for the apartments are listed below in ascending order.

425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615

[Histogram of relative frequency for the apartment rents: skewness = .92]

3.2 z-Scores

The z-score is often called the standardized value.

It denotes the number of standard deviations a data value $x_i$ is from the mean.

$$z_i = \frac{x_i - \bar{x}}{s}$$

Excel’s STANDARDIZE function can be used to compute the z-score.

z-Scores

• An observation’s z-score is a measure of the relative location of the observation in a data set.
• A data value less than the sample mean will have a z-score less than zero.
• A data value greater than the sample mean will have a z-score greater than zero.
• A data value equal to the sample mean will have a z-score of zero.

z-Scores

• Example: Apartment Rents

• z-Score of Smallest Value (425)

$$z = \frac{x_i - \bar{x}}{s} = \frac{425 - 490.80}{54.74} = -1.20$$

Standardized Values for Apartment Rents

-1.20 -1.11 -1.11 -1.02 -1.02 -1.02 -1.02 -1.02 -0.93 -0.93
-0.93 -0.93 -0.93 -0.84 -0.84 -0.84 -0.84 -0.84 -0.75 -0.75
-0.75 -0.75 -0.75 -0.75 -0.75 -0.56 -0.56 -0.56 -0.47 -0.47
-0.47 -0.38 -0.38 -0.34 -0.29 -0.29 -0.29 -0.20 -0.20 -0.20
-0.20 -0.11 -0.01 -0.01 -0.01 0.17 0.17 0.17 0.17 0.35
0.35 0.44 0.62 0.62 0.62 0.81 1.06 1.08 1.45 1.45
1.54 1.54 1.63 1.81 1.99 1.99 1.99 1.99 2.27 2.27

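The standardized values in the table can be regenerated from the raw rents. This is a minimal sketch that computes the sample mean and sample standard deviation from the data rather than using the rounded figures; the names `z_scores` and `flagged` are illustrative. The last lines apply the |z| > 3 screen for outliers.

```python
import math

rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

n = len(rents)
mean = sum(rents) / n
s = math.sqrt(sum((x - mean) ** 2 for x in rents) / (n - 1))

# z_i = (x_i - mean) / s for every observation
z_scores = [(x - mean) / s for x in rents]

# Values whose z-score is below -3 or above +3 might be considered outliers.
flagged = [x for x, z in zip(rents, z_scores) if abs(z) > 3]

print(round(min(z_scores), 2), round(max(z_scores), 2), flagged)
```
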
3.3 Chebyshev’s Theorem

At least $(1 - 1/z^2)$ of the items in any data set will be within $z$ standard deviations of the mean, where $z$ is any value greater than 1.

Chebyshev’s theorem requires $z > 1$, but $z$ need not be an integer.

Chebyshev’s Theorem

• At least 75% of the data values must be within z = 2 standard deviations of the mean.
• At least 89% of the data values must be within z = 3 standard deviations of the mean.
• At least 94% of the data values must be within z = 4 standard deviations of the mean.

Chebyshev’s Theorem

• Example: Apartment Rents

Let z = 1.5 with $\bar{x}$ = 490.80 and s = 54.74.

At least (1 - 1/(1.5)^2) = 1 - 0.44 = 0.56 or 56% of the rent values must be between

$\bar{x}$ - z(s) = 490.80 - 1.5(54.74) = 409
and
$\bar{x}$ + z(s) = 490.80 + 1.5(54.74) = 573

(Actually, 86% of the rent values are between 409 and 573.)

Important! The true proportions found within the indicated regions could be greater than what the theorem guarantees.

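The comparison between the guaranteed proportion and the observed proportion can be sketched as below. The mean and standard deviation are the rounded values quoted in the example; the variable names are illustrative.

```python
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

mean, s, z = 490.80, 54.74, 1.5      # values quoted in the example

bound = 1 - 1 / z ** 2               # Chebyshev's guaranteed minimum proportion
lower = mean - z * s                 # about 409
upper = mean + z * s                 # about 573

inside = sum(lower <= x <= upper for x in rents)
actual = inside / len(rents)         # observed proportion within the interval

print(round(bound, 2), round(lower), round(upper), inside, round(actual, 2))
```
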
3.4 Empirical Rule

When the data are believed to approximate a bell-shaped distribution …

The empirical rule can be used to determine the percentage of data values that must be within a specified number of standard deviations of the mean.

The empirical rule is based on the normal distribution, which is covered in Chapter 6.

Empirical Rule

For data having a bell-shaped distribution:

• 68.26% of the values of a normal random variable are within +/- 1 standard deviation of its mean.
• 95.44% of the values of a normal random variable are within +/- 2 standard deviations of its mean.
• 99.72% of the values of a normal random variable are within +/- 3 standard deviations of its mean.

Empirical Rule

[Figure: bell-shaped curve over x showing 68.26% of values within μ ± 1σ, 95.44% within μ ± 2σ, and 99.72% within μ ± 3σ]

[Figure: the same bell-shaped curve with the horizontal axis labeled "Tel. bill"]

Chebyshev's Theorem vs Empirical Rule

3.5 Detecting Outliers

• An outlier is an unusually small or unusually large value in a data set.
• A data value with a z-score less than -3 or greater than +3 might be considered an outlier.
• It might be:
  • an incorrectly recorded data value
  • a data value that was incorrectly included in the data set
  • a correctly recorded data value that belongs in the data set

Detecting Outliers

• Example: Apartment Rents
  • The most extreme z-scores are -1.20 and 2.27.
  • Using |z| > 3 as the criterion for an outlier, there are no outliers in this data set.

4. Exploratory Data Analysis

Exploratory data analysis procedures enable us to use simple arithmetic and easy-to-draw pictures to summarize data.

We simply sort the data values into ascending order and identify the five-number summary and then construct a box plot.

4.1 Five-Number Summary

1 Smallest Value
2 First Quartile
3 Median
4 Third Quartile
5 Largest Value

• Example: Apartment Rents

Lowest Value = 425
First Quartile = 445
Median = 475
Third Quartile = 525
Largest Value = 615

A box plot is a graphical summary of data that is based on a five-number summary.

A key to the development of a box plot is the computation of the median and the quartiles Q1 and Q3.

Box plots provide another way to identify outliers.

Box Plot

• Example: Apartment Rents
  • A box is drawn with its ends located at the first and third quartiles.
  • A vertical line is drawn in the box at the location of the median (second quartile).

[Box plot on an axis from 400 to 625 in steps of 25: Q1 = 445, Q2 = 475, Q3 = 525]

Box Plot

• Limits are located (not drawn) using the interquartile range (IQR).
• Data outside these limits are considered outliers.
• The location of each outlier is shown with the symbol *.

Box Plot

• Example: Apartment Rents
  • The lower limit is located 1.5(IQR) below Q1.
    Lower Limit: Q1 - 1.5(IQR) = 445 - 1.5(80) = 325
  • The upper limit is located 1.5(IQR) above Q3.
    Upper Limit: Q3 + 1.5(IQR) = 525 + 1.5(80) = 645
  • There are no outliers (values less than 325 or greater than 645) in the apartment rent data.

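The box plot limits and the whisker endpoints can be checked numerically. In this sketch the quartiles Q1 = 445 and Q3 = 525 are taken as given from the example rather than recomputed; the variable names are illustrative.

```python
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

q1, q3 = 445, 525                  # quartiles taken from the example
iqr = q3 - q1

lower_limit = q1 - 1.5 * iqr       # 1.5(IQR) below Q1
upper_limit = q3 + 1.5 * iqr       # 1.5(IQR) above Q3

# Data outside the limits are considered outliers.
outliers = [x for x in rents if x < lower_limit or x > upper_limit]

# Whiskers end at the smallest and largest values inside the limits.
inside = [x for x in rents if lower_limit <= x <= upper_limit]
whiskers = (min(inside), max(inside))

print(lower_limit, upper_limit, outliers, whiskers)
```
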
Box Plot

• Example: Apartment Rents
  • Whiskers (dashed lines) are drawn from the ends of the box to the smallest and largest data values inside the limits.

[Box plot on an axis from 400 to 625 in steps of 25: smallest value inside limits = 425, largest value inside limits = 615]

Box Plot

An excellent graphical technique for making comparisons among two or more groups.

5. Measures of Association Between Two Variables

Thus far we have examined numerical methods used to summarize the data for one variable at a time.

Often a manager or decision maker is interested in the relationship between two variables.

Two descriptive measures of the relationship between two variables are covariance and correlation coefficient.

The covariance is a measure of the linear association between two variables.

• Positive values indicate a positive relationship.
• Negative values indicate a negative relationship.

The covariance is computed as follows:

$$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$ for samples

$$\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$$ for populations

Correlation is a measure of linear association and not
necessarily causation.

Just because two variables are highly correlated, it
does not mean that one variable is the cause of the
other.

The correlation coefficient is computed as follows:

$$r_{xy} = \frac{s_{xy}}{s_x s_y} \quad \text{for samples}$$

$$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y} \quad \text{for populations}$$

The coefficient can take on values between -1 and +1.

Values near -1 indicate a strong negative linear
relationship.

Values near +1 indicate a strong positive linear
relationship.

The closer the correlation is to zero, the weaker the
relationship.

 Example: Golfing Study

A golfer is interested in investigating the
relationship, if any, between driving distance
and 18-hole score.
Average Driving      Average
Distance (yds.)      18-Hole Score
    277.6                69
    259.5                71
    269.1                70
    267.0                70
    255.6                71
    272.9                69

 Example: Golfing Study

    x         y      (xi - x̄)   (yi - ȳ)   (xi - x̄)(yi - ȳ)

  277.6      69       10.65      -1.0          -10.65
  259.5      71       -7.45       1.0           -7.45
  269.1      70        2.15       0              0
  267.0      70        0.05       0              0
  255.6      71      -11.35       1.0          -11.35
  272.9      69        5.95      -1.0           -5.95
Average    267.0     70.0                Total  -35.40
Std. Dev.  8.2192    .8944

 Example: Golfing Study

• Sample Covariance

$$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1} = \frac{-35.40}{6 - 1} = -7.08$$

• Sample Correlation Coefficient

$$r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{-7.08}{(8.2192)(.8944)} = -.9631$$
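The golfing figures can be checked with a short Python sketch that follows the same steps as the table: deviations from the means, their cross-products, then the two formulas. The variable names are illustrative assumptions; the data are the six observations from the slide.

```python
import math

# Data from the golfing study slide.
distance = [277.6, 259.5, 269.1, 267.0, 255.6, 272.9]  # average driving distance (yds.)
score = [69, 71, 70, 70, 71, 69]                       # average 18-hole score

n = len(distance)
x_bar = sum(distance) / n
y_bar = sum(score) / n

# Cross-products of deviations, one per golfer (last column of the table).
cross_products = [(x - x_bar) * (y - y_bar) for x, y in zip(distance, score)]

# Sample covariance and sample standard deviations (divisor n - 1).
s_xy = sum(cross_products) / (n - 1)
s_x = math.sqrt(sum((x - x_bar) ** 2 for x in distance) / (n - 1))
s_y = math.sqrt(sum((y - y_bar) ** 2 for y in score) / (n - 1))

# Sample correlation coefficient.
r_xy = s_xy / (s_x * s_y)

print(round(s_xy, 2), round(s_x, 4), round(s_y, 4), round(r_xy, 4))
```

A correlation this close to -1 indicates a strong negative linear relationship in this sample: longer average drives go with lower scores.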

6. The Weighted Mean and
Working with Grouped Data
 6.1 Weighted Mean
 6.2 Mean for Grouped Data
 6.3 Variance for Grouped Data
 6.4 Standard Deviation for Grouped
Data

 When the mean is computed by giving each data
value a weight that reflects its importance, it is
referred to as a weighted mean.
 In the computation of a grade point average (GPA),
the weights are the number of credit hours earned
for each grade.
 When data values vary in importance, the analyst
must choose the weight that best reflects the
importance of each value.

$$\bar{x} = \frac{\sum w_i x_i}{\sum w_i}$$

where:
xi = value of observation i
wi = weight for observation i
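As a minimal sketch of the formula, the GPA case can be written in Python. The grade points and credit hours below are hypothetical values chosen for illustration, not data from the slides.

```python
def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Hypothetical transcript: grade points earned and credit hours per course.
grade_points = [4.0, 3.0, 2.0]   # x_i
credit_hours = [3, 4, 3]         # w_i

gpa = weighted_mean(grade_points, credit_hours)
print(gpa)
```

Here the 4-credit course counts more than either 3-credit course, which is exactly what the weights are meant to express.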

 The weighted mean computation can be used to
obtain approximations of the mean, variance, and
standard deviation for grouped data.
 To compute the weighted mean, we treat the
midpoint of each class as though it were the mean
of all items in the class.
 We compute a weighted mean of the class midpoints
using the class frequencies as weights.
 Similarly, in computing the variance and standard
deviation, the class frequencies are used as weights.

 Sample Data

$$\bar{x} = \frac{\sum f_i M_i}{n}$$

 Population Data

$$\mu = \frac{\sum f_i M_i}{N}$$

where:
fi = frequency of class i
Mi = midpoint of class i

 Example: Apartment Rents

The previously presented sample of apartment
rents is shown here as grouped data in the form of
a frequency distribution.

Rent ($)     Frequency
420-439          8
440-459         17
460-479         12
480-499          8
500-519          7
520-539          4
540-559          2
560-579          4
580-599          2
600-619          6

Sample Mean for Grouped Data
 Example: Apartment Rents

Rent ($)     fi      Mi      fiMi

420-439       8    429.5    3436.0
440-459      17    449.5    7641.5
460-479      12    469.5    5634.0
480-499       8    489.5    3916.0
500-519       7    509.5    3566.5
520-539       4    529.5    2118.0
540-559       2    549.5    1099.0
560-579       4    569.5    2278.0
580-599       2    589.5    1179.0
600-619       6    609.5    3657.0
Total        70            34525.0

$$\bar{x} = \frac{34{,}525}{70} = 493.21$$

This approximation differs by $2.41 from the actual sample
mean of $490.80.
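The grouped-data mean for the apartment rents can be reproduced with a short Python sketch. The class limits and frequencies are taken from the frequency distribution; the variable names are illustrative assumptions.

```python
# (lower limit, upper limit, frequency) for each rent class.
classes = [
    (420, 439, 8), (440, 459, 17), (460, 479, 12), (480, 499, 8),
    (500, 519, 7), (520, 539, 4), (540, 559, 2), (560, 579, 4),
    (580, 599, 2), (600, 619, 6),
]

# The midpoint of each class stands in for every rent in that class.
midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
frequencies = [f for _, _, f in classes]

n = sum(frequencies)                                     # sample size
total = sum(f * m for f, m in zip(frequencies, midpoints))  # sum of f_i * M_i
grouped_mean = total / n

print(n, total, round(grouped_mean, 2))
```

The result is an approximation because it assumes the rents in each class average out to the class midpoint.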

6.3 Variance for Grouped Data
 For sample data

$$s^2 = \frac{\sum f_i (M_i - \bar{x})^2}{n - 1}$$

 For population data

$$\sigma^2 = \frac{\sum f_i (M_i - \mu)^2}{N}$$

Sample Variance for Grouped Data
 Example: Apartment Rents

Rent ($)    fi     Mi     Mi - x̄    (Mi - x̄)²    fi(Mi - x̄)²
420-439      8   429.5    -63.7      4058.96      32471.71
440-459     17   449.5    -43.7      1910.56      32479.59
460-479     12   469.5    -23.7       562.16       6745.97
480-499      8   489.5     -3.7        13.76        110.11
500-519      7   509.5     16.3       265.36       1857.55
520-539      4   529.5     36.3      1316.96       5267.86
540-559      2   549.5     56.3      3168.56       6337.13
560-579      4   569.5     76.3      5820.16      23280.66
580-599      2   589.5     96.3      9271.76      18543.53
600-619      6   609.5    116.3     13523.36      81140.18
Total       70                                   208234.29
Sample Variance for Grouped Data
 Example: Apartment Rents
• Sample Variance
$$s^2 = \frac{208{,}234.29}{70 - 1} = 3{,}017.89$$

• Sample Standard Deviation

$$s = \sqrt{3{,}017.89} = 54.94$$

This approximation differs by only $.20
from the actual standard deviation of $54.74.
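The grouped-data variance and standard deviation can be checked the same way. This Python sketch uses the class midpoints and frequencies from the table; the variable names are illustrative assumptions. It carries the unrounded mean through the calculation, so individual rows may differ from the table in the last digit.

```python
import math

# Class midpoints (M_i) and frequencies (f_i) from the apartment rents table.
midpoints = [429.5, 449.5, 469.5, 489.5, 509.5, 529.5, 549.5, 569.5, 589.5, 609.5]
frequencies = [8, 17, 12, 8, 7, 4, 2, 4, 2, 6]

n = sum(frequencies)
grouped_mean = sum(f * m for f, m in zip(frequencies, midpoints)) / n

# Frequency-weighted sum of squared deviations of the midpoints from the mean.
sum_sq_dev = sum(f * (m - grouped_mean) ** 2 for f, m in zip(frequencies, midpoints))

sample_variance = sum_sq_dev / (n - 1)
sample_std_dev = math.sqrt(sample_variance)

print(round(sum_sq_dev, 2), round(sample_variance, 2), round(sample_std_dev, 2))
```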

1. How would you explain the difference between correlation and
covariance? List the key differences between covariance and
correlation.

2. What is the difference between variance and covariance?
