Chapter 4A
Mean
Median
Mode
Probability Density Function
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$
where
µ = Mean of the normal random variable x
σ = Standard deviation
π = 3.1415 . . .
e = 2.71828 . . .
P(x < a) is obtained from a table of normal
probabilities
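As a minimal sketch, the density and the probability P(x < a) can also be evaluated numerically rather than read from a table. The function names `normal_pdf` and `normal_cdf` are illustrative, not part of the course material; the cumulative probability uses the standard identity linking the normal distribution to the error function.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal random variable with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -0.5 * ((x - mu) / sigma) ** 2
    return coeff * math.exp(exponent)

def normal_cdf(a, mu, sigma):
    """P(x < a), computed with the error function instead of a printed table."""
    return 0.5 * (1.0 + math.erf((a - mu) / (sigma * math.sqrt(2.0))))

peak = normal_pdf(0.0, 0.0, 1.0)        # height of the standard normal curve at its mean
below_mean = normal_cdf(0.0, 0.0, 1.0)  # half of the area lies below the mean
```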
Effect of Varying
Parameters (µ & σ)
Normal Distribution
Probability
Probability is area under the curve!
$$P(c \le x \le d) = \int_{c}^{d} f(x)\,dx$$
[Figure: normal curve f(x) with the area between x = c and x = d shaded]
Standard Normal Distribution
The standard normal distribution is a normal
distribution with µ = 0 and σ = 1. A random
variable with a standard normal distribution,
denoted by the symbol z, is called a standard
normal random variable.
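The Standard Normal Table:
P(0 < z < 1.96)
Standardized Normal Probability Table (Portion)
Z     .04    .05    .06
1.8   .4671  .4678  .4686
1.9   .4738  .4744  .4750
2.0   .4793  .4798  .4803
2.1   .4838  .4842  .4846
The table entries are probabilities: P(0 < z < 1.96) = .4750 (row 1.9, column .06).
[Figure: standard normal curve (µ = 0, σ = 1) with the area between 0 and 1.96 shaded; shaded area exaggerated]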
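The table lookup can be reproduced numerically. The sketch below assumes Python's standard-library `statistics.NormalDist`; the helper name `table_area` is illustrative. It returns the area between 0 and z, which is the quantity this kind of table tabulates.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def table_area(z):
    """Area under the standard normal curve between 0 and z (z >= 0)."""
    return STANDARD_NORMAL.cdf(z) - 0.5

# Rebuild the portion of the table for rows 1.8 to 2.1 and columns .04 to .06.
rows = [1.8, 1.9, 2.0, 2.1]
cols = [0.04, 0.05, 0.06]
portion = {(r, c): round(table_area(r + c), 4) for r in rows for c in cols}

area_0_to_196 = table_area(1.96)
```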
The Standard Normal Table:
P(–1.26 ≤ z ≤ 1.26)
[Figure: standard normal curve (µ = 0) with the area between –1.26 and 1.26 shaded; shaded area exaggerated]
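The Standard Normal Table:
P(z > 1.26)
P(z > 1.26) = .5000 – .3962 = .1038
[Figure: standard normal curve (µ = 0); the area to the right of 0 is .5000, the area between 0 and 1.26 is .3962, and the shaded area lies to the right of 1.26]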
The Standard Normal Table:
P(–2.78 ≤ z ≤ –2.00)
P(–2.78 ≤ z ≤ –2.00) = .4973 – .4772 = .0201
[Figure: standardized normal distribution (µ = 0, σ = 1) with the area between –2.78 and –2.00 shaded; shaded area exaggerated]
The Standard Normal Table:
P(z > –2.13)
[Figure: standard normal curve (µ = 0) with the area to the right of –2.13 shaded; shaded area exaggerated]
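The four table exercises all combine the same two ingredients: the area between 0 and |z|, and the symmetry of the curve about 0. A minimal sketch, assuming Python's `statistics.NormalDist` in place of the printed table (the helper name `area_from_zero` is illustrative):

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def area_from_zero(z):
    """Table-style area between 0 and |z|; symmetry makes the sign irrelevant."""
    return Z.cdf(abs(z)) - 0.5

p_between = 2 * area_from_zero(1.26)                         # P(-1.26 <= z <= 1.26)
p_right = 0.5 - area_from_zero(1.26)                         # P(z > 1.26)
p_left_band = area_from_zero(-2.78) - area_from_zero(-2.00)  # P(-2.78 <= z <= -2.00)
p_above_negative = 0.5 + area_from_zero(-2.13)               # P(z > -2.13)
```

The results agree with the table-based answers to within the table's four-decimal rounding.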
Non-standard Normal
Distribution
Normal distributions differ by mean & standard deviation.
Each distribution would require its own table.
That’s an infinite number of tables!
[Figure: several normal curves f(x) with different means and standard deviations]
Converting a Normal Distribution to
a Standard Normal Distribution
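$$z = \frac{x - \mu}{\sigma}$$
[Figure: a normal distribution with mean µ and standard deviation σ is mapped to the standardized normal distribution with µ = 0 and σ = 1]
One table!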
Finding a Probability Corresponding to a
Normal Random Variable
1. Sketch normal distribution, indicate mean, and shade
the area corresponding to the probability you want.
2. Convert the boundaries of the shaded area from x
values to standard normal random variable z values using
$$z = \frac{x - \mu}{\sigma}$$
Show the z values under corresponding x values.
3. Use Table II in Appendix D to find the areas
corresponding to the z values. Use symmetry when
necessary.
Non-standard Normal μ = 5,
σ = 10: P(5 < x < 6.2)
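$$z = \frac{x - \mu}{\sigma} = \frac{6.2 - 5}{10} = .12$$
P(5 < x < 6.2) = P(0 < z < .12) = .0478
[Figure: normal distribution (µ = 5, σ = 10) with the area between 5 and 6.2 shaded, beside the standardized normal distribution (µ = 0, σ = 1) with the area between 0 and .12 shaded; shaded area exaggerated]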
Non-standard Normal μ = 5,
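σ = 10: P(3.8 ≤ x ≤ 5)
$$z = \frac{x - \mu}{\sigma} = \frac{3.8 - 5}{10} = -.12$$
P(3.8 ≤ x ≤ 5) = P(–.12 ≤ z ≤ 0) = .0478
[Figure: normal distribution (µ = 5, σ = 10) with the area between 3.8 and 5 shaded, beside the standardized normal distribution (µ = 0, σ = 1) with the area between –.12 and 0 shaded; shaded area exaggerated]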
Non-standard Normal μ = 5,
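σ = 10: P(2.9 ≤ x ≤ 7.1)
$$z = \frac{x - \mu}{\sigma} = \frac{2.9 - 5}{10} = -.21, \qquad z = \frac{x - \mu}{\sigma} = \frac{7.1 - 5}{10} = .21$$
P(2.9 ≤ x ≤ 7.1) = P(–.21 ≤ z ≤ .21) = .0832 + .0832 = .1664
[Figure: normal distribution (µ = 5, σ = 10) with the area between 2.9 and 7.1 shaded, beside the standardized normal distribution (µ = 0, σ = 1) with the area between –.21 and .21 shaded]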
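The convert-then-look-up procedure and the three examples with µ = 5 and σ = 10 can be checked with a short sketch. It assumes Python's `statistics.NormalDist` as a stand-in for the printed table; the function name `normal_interval_probability` is illustrative.

```python
from statistics import NormalDist

def normal_interval_probability(low, high, mu, sigma):
    """P(low <= x <= high) for a normal x, via the conversion z = (x - mu) / sigma."""
    z_low = (low - mu) / sigma
    z_high = (high - mu) / sigma
    standard = NormalDist()  # mu = 0, sigma = 1
    return standard.cdf(z_high) - standard.cdf(z_low)

mu, sigma = 5.0, 10.0
p1 = normal_interval_probability(5.0, 6.2, mu, sigma)  # z from 0 to .12
p2 = normal_interval_probability(3.8, 5.0, mu, sigma)  # z from -.12 to 0
p3 = normal_interval_probability(2.9, 7.1, mu, sigma)  # z from -.21 to .21
```

Small differences in the fourth decimal place relative to the table answers come from the table's rounding.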
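[Figure: standardized normal curve with areas .1179 and .0832 marked; their difference is .1179 – .0832 = .0347]
[Figure: portion of the standardized normal probability table (row 0.2: .0793 .0832 .0871) with the unknown z value located at z = .31]
$$x = \mu + z\sigma = 5 + (.31)(10) = 8.1$$
[Figure: normal distribution (µ = 5) with x = 8.1 corresponding to z = .31 on the standardized normal distribution (µ = 0); shaded areas exaggerated]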
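Going from a z value back to an x value simply inverts the standardizing formula. A minimal sketch using the numbers on this slide (µ = 5, σ = 10, z = .31), with a round-trip check that assumes Python's `statistics.NormalDist`:

```python
from statistics import NormalDist

mu, sigma = 5.0, 10.0  # parameters used in the worked examples
z = 0.31               # z value read from the standard normal table

x = mu + z * sigma     # invert z = (x - mu) / sigma

# Round trip: standardizing x recovers z, and both describe the same cumulative area.
z_back = (x - mu) / sigma
area_x = NormalDist(mu, sigma).cdf(x)
area_z = NormalDist().cdf(z)
```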
Descriptive Methods for
Assessing Normality (optional)
Determining Whether the Data
Are from an Approximately
Normal Distribution
1. Construct either a histogram or stem-and-leaf
display for the data and note the shape of the
graph. If the data are approximately normal,
the shape of the histogram or stem-and-leaf
display will be similar to the normal curve.
2. Compute the intervals x̄ ± s, x̄ ± 2s, and x̄ ± 3s,
and determine the percentage of
measurements falling in each. If the data are
approximately normal, the percentages will be
approximately equal to 68%, 95%, and 100%,
respectively; from the Empirical Rule (68%,
95%, 99.7%).
3. Find the interquartile range, IQR, and standard
deviation, s, for the sample, then calculate the
ratio IQR/s. If the data are approximately
normal, then IQR/s ≈ 1.3.
$$\frac{\text{IQR}}{s} = \frac{Q_3 - Q_1}{s}$$
4. Examine a normal probability plot for the
data. If the data are approximately normal,
the points will fall (approximately) on a
straight line.
[Figure: normal probability plot with expected z-score on one axis and observed value on the other]
Normal Probability Plot
A normal probability plot for a data set is a
scatterplot with the ranked data values on one axis
and their corresponding expected z-scores from a
standard normal distribution on the other axis.
[Note: Computation of the expected standard
normal z-scores is beyond the scope of this text.
Therefore, we will rely on available statistical
software packages to generate a normal
probability plot.]
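The interval percentages (check 2) and the IQR/s ratio (check 3) are straightforward to compute. The sketch below is illustrative only: the data are simulated from a normal distribution with an arbitrary mean and standard deviation, so the checks should come out close to the stated benchmarks. All variable names are hypothetical.

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is repeatable
data = [random.gauss(100.0, 15.0) for _ in range(5000)]  # hypothetical sample

x_bar = statistics.mean(data)
s = statistics.stdev(data)

def share_within(k):
    """Fraction of measurements inside the interval x_bar ± k*s."""
    lo, hi = x_bar - k * s, x_bar + k * s
    return sum(lo <= v <= hi for v in data) / len(data)

shares = {k: share_within(k) for k in (1, 2, 3)}  # compare with 68%, 95%, 100%

q1, _, q3 = statistics.quantiles(data, n=4)       # quartile cut points
iqr_over_s = (q3 - q1) / s                        # compare with about 1.3
```

For clearly skewed data, the same calculations would typically drift away from these benchmarks.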
Appendix 2: Sampling
Distribution
Parameter & Statistic
                      Sample statistic   Population parameter
Mean                  x̄                  µ
Standard Deviation    s                  σ
Variance              s²                 σ²
Binomial Proportion   p̂                  p
Sampling Distribution
$$\mu_{\bar{x}} = \frac{\sum_{i=1}^{N} \bar{x}_i}{N} = \frac{1.0 + 1.5 + \dots + 4.0}{16} = 2.5$$
Comparison
[Figure: population distribution P(x) over x = 1, 2, 3, 4 (µ = 2.5) beside the sampling distribution P(x̄) over x̄ = 1.0, 1.5, ..., 4.0 (µ_x̄ = 2.5)]
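The sampling distribution in this comparison can be built by brute force. The sketch below assumes the population values 1, 2, 3, 4 are equally likely and that samples of size n = 2 are drawn with replacement; the sample size is an inference from the 16 sample means ranging from 1.0 to 4.0, not something the slide states.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4]  # values on the population axis
n = 2                      # assumed sample size (gives 4**2 = 16 samples)

samples = list(product(population, repeat=n))  # every ordered sample, with replacement
sample_means = [Fraction(sum(s), n) for s in samples]

mean_of_means = sum(sample_means) / len(sample_means)
distribution = {
    float(m): Fraction(count, len(samples))
    for m, count in sorted(Counter(sample_means).items())
}
```

Exact fractions are used so that the mean of the sample means can be compared with the population mean without rounding error.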
The Sampling Distribution
of a Sample Mean and the
Central Limit Theorem
Properties of the Sampling
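Distribution of x̄
◼ Dispersion:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
◼ Sampling with replacement
[Figure: sampling distributions of x̄ centered at µ_x̄ = 50 (population mean µ = 50), with σ_x̄ = 5 for n = 4 and σ_x̄ = 2.5 for n = 16]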
Standardizing the Sampling
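Distribution of x̄
$$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
[Figure: sampling distribution of x̄ (mean µ_x̄, standard deviation σ_x̄) mapped to the standardized normal distribution (µ = 0, σ = 1)]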
Central Limit Theorem
Consider a random sample of n observations
selected from a population (any probability
distribution) with mean μ and standard deviation σ.
Then, when n is sufficiently large, the sampling
distribution of x̄ will be approximately a normal
distribution with mean µ_x̄ = µ and standard
deviation σ_x̄ = σ/√n. The larger the sample size,
the better will be the normal approximation to the
sampling distribution of x̄.
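A small simulation can illustrate the theorem. The sketch below draws repeated samples from a strongly skewed population (an exponential distribution with mean 1 and standard deviation 1, chosen purely for illustration) and compares the simulated sampling distribution of the sample mean with the predicted mean µ and standard deviation σ/√n. The sample size, number of repetitions, and seed are arbitrary assumptions.

```python
import math
import random
import statistics

random.seed(1)        # fixed seed so the illustration is repeatable
mu, sigma = 1.0, 1.0  # mean and standard deviation of the Exponential(1) population
n = 30                # sample size
num_samples = 4000    # number of repeated samples (arbitrary choice)

sample_means = [
    statistics.mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.mean(sample_means)  # should be close to mu
sd_of_means = statistics.stdev(sample_means)   # should be close to sigma / sqrt(n)
predicted_sd = sigma / math.sqrt(n)

# Share of sample means within 1.96 predicted standard deviations of mu;
# a normal sampling distribution would put about 95% there.
share_within = sum(
    abs(m - mu) <= 1.96 * predicted_sd for m in sample_means
) / num_samples
```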
Central Limit Theorem
As sample size gets large enough (n ≥ 30), the sampling
distribution becomes almost normal.
$$\mu_{\bar{x}} = \mu, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
The Sampling Distribution
of the Sample Proportion
Sample Proportion
p: proportion; percentage; fraction; rate (qualitative data)
Estimates
Confidence limit (lower) and confidence limit (upper):
$$\bar{x} \pm 1.96\,\sigma_{\bar{x}} = \bar{x} \pm \frac{1.96\,\sigma}{\sqrt{n}}$$
◼ For large samples, the fact that σ is unknown poses
little problem: the sample standard deviation s
provides a very good approximation to σ.
Confidence Interval
If sample measurements yield a value of x̄ that falls
between the two lines on either side of µ, then the
interval x̄ ± 1.96σ_x̄ will contain µ.
95% Confidence Level
If our confidence level is 95%, then in the long run,
95% of our confidence intervals will contain µ and
5% will not.
To choose a different confidence coefficient we
increase or decrease the area (call it α) assigned to
the tails. If we place α/2 in each tail, the interval is
$$\bar{x} \pm z_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$$
where s is the sample standard deviation.
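A minimal sketch of the large-sample interval follows. The function name `large_sample_ci` and the summary statistics in the example call are hypothetical; the z value is obtained from Python's `statistics.NormalDist` rather than a table.

```python
import math
from statistics import NormalDist

def large_sample_ci(x_bar, s, n, confidence=0.95):
    """Return (lower, upper) for x_bar ± z_{alpha/2} * s / sqrt(n)."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 when confidence = 0.95
    half_width = z * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Hypothetical summary statistics: sample mean 50, s = 10, n = 100.
lower, upper = large_sample_ci(x_bar=50.0, s=10.0, n=100)
```

Changing `confidence` changes the tail area α and therefore the z value, which widens or narrows the interval.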
Required Conditions
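[Figure: Student's t distributions (df = 5 and df = 13) compared with the standard normal z distribution, all centered at 0: the t curves are bell-shaped and symmetric, with ‘fatter’ tails]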
t - Table
t-value
If we want the t-value with an area of .025 to its
right and 4 df, we look in the table under the
column t.025 for the entry in the row corresponding
to 4 df. This entry is t.025 = 2.776. The
corresponding standard normal z-score is z.025 =
1.96.
Small-Sample
Confidence Interval for µ
$$\bar{x} \pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$$
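As a sketch of the small-sample interval, the example below uses five hypothetical measurements (so df = 4) together with the table value t.025 = 2.776 quoted above for 4 df. The t value is hard-coded because Python's standard library does not provide t-distribution quantiles.

```python
import math
import statistics

data = [10.0, 12.0, 11.0, 13.0, 14.0]  # hypothetical measurements, n = 5 so df = 4
t_025 = 2.776                          # table value for .025 upper-tail area, 4 df

n = len(data)
x_bar = statistics.mean(data)
s = statistics.stdev(data)             # sample standard deviation (n - 1 divisor)

half_width = t_025 * s / math.sqrt(n)
interval = (x_bar - half_width, x_bar + half_width)  # 95% confidence interval for the mean
```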
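$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
$$\hat{p} = \frac{32}{400} = 0.08$$
$$.053 \le p \le .107$$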
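The interval for p can be reproduced from the counts 32 and 400. The confidence level is not stated next to these numbers; the sketch assumes z = 1.96 (95% confidence), which is consistent with the limits .053 and .107. The function name `proportion_ci` is illustrative.

```python
import math

def proportion_ci(successes, n, z):
    """Large-sample interval p_hat ± z * sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n
    q_hat = 1.0 - p_hat
    half_width = z * math.sqrt(p_hat * q_hat / n)
    return p_hat, p_hat - half_width, p_hat + half_width

# z = 1.96 is an assumption (95% confidence), not stated in the example itself.
p_hat, lower, upper = proportion_ci(32, 400, z=1.96)
```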
Thinking Challenge
You’re a production
manager for a newspaper.
You want to find the %
defective. Of 200
newspapers, 35 had
defects. What is the 90%
confidence interval estimate
of the population
proportion defective?
Problem
Adjusted (1 – α)100% Confidence
Interval for a Population Proportion, p
$$\tilde{p} \pm z_{\alpha/2}\sqrt{\frac{\tilde{p}(1 - \tilde{p})}{n + 4}}$$
where $\tilde{p} = \frac{x + 2}{n + 4}$ is the adjusted sample proportion
of observations with the characteristic of interest, x
is the number of successes in the sample, and n is
the sample size.
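A minimal sketch of the adjusted interval follows. The function name `adjusted_proportion_ci` is illustrative, and the example call reuses x = 32 and n = 400 with an assumed z = 1.96 purely to show the mechanics.

```python
import math

def adjusted_proportion_ci(x, n, z):
    """Adjusted interval: p_tilde ± z * sqrt(p_tilde * (1 - p_tilde) / (n + 4))."""
    p_tilde = (x + 2) / (n + 4)  # adjusted sample proportion
    half_width = z * math.sqrt(p_tilde * (1.0 - p_tilde) / (n + 4))
    return p_tilde, p_tilde - half_width, p_tilde + half_width

# Illustrative inputs only; z = 1.96 assumes 95% confidence.
p_tilde, lower, upper = adjusted_proportion_ci(x=32, n=400, z=1.96)
```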
Determining the Sample Size
Sample size and C.I.
Sampling Error
In general, we express the reliability associated
with a confidence interval for the population mean
µ by specifying the sampling error within which
we want to estimate µ with 100(1 – α)% confidence.
The sampling error (denoted SE), then, is equal to
the half-width of the confidence interval.
Sample Size Determination for 100(1 – α)%
Confidence Interval for µ
$$n = \frac{(z_{\alpha/2})^{2}\,\sigma^{2}}{(SE)^{2}}$$
Sample Size Example
What sample size is needed to be 90% confident
the mean is within 5? A pilot study suggested
that the standard deviation is 45.
$$n = \frac{(z_{\alpha/2})^{2}\,\sigma^{2}}{(SE)^{2}} = \frac{(1.645)^{2}(45)^{2}}{(5)^{2}} = 219.2 \approx 220$$
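The example can be checked with a few lines of code. The function name `sample_size_for_mean` is illustrative; the result is rounded up because a fractional observation cannot be sampled.

```python
import math

def sample_size_for_mean(z, sigma, sampling_error):
    """Smallest n with z * sigma / sqrt(n) <= sampling_error."""
    n_exact = (z * sigma / sampling_error) ** 2
    return n_exact, math.ceil(n_exact)

# 90% confidence (z = 1.645), pilot-study sigma = 45, sampling error SE = 5.
n_exact, n_required = sample_size_for_mean(z=1.645, sigma=45.0, sampling_error=5.0)
```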
Sample Size Determination for 100(1 – α)%
Confidence Interval for p
$$n = \frac{(z_{\alpha/2})^{2}(pq)}{(SE)^{2}} = \frac{(1.645)^{2}(.5)(.5)}{(.015)^{2}} = 3006.69 \approx 3007$$
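The same calculation for a proportion, as a sketch with an illustrative function name. Using p = .5 gives the largest possible value of pq and therefore a conservative sample size.

```python
import math

def sample_size_for_proportion(z, p, sampling_error):
    """Smallest n with z * sqrt(p * q / n) <= sampling_error, where q = 1 - p."""
    q = 1.0 - p
    n_exact = (z ** 2) * p * q / sampling_error ** 2
    return n_exact, math.ceil(n_exact)

# 90% confidence (z = 1.645), p = .5, sampling error SE = .015.
n_exact, n_required = sample_size_for_proportion(z=1.645, p=0.5, sampling_error=0.015)
```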
Thinking Challenge
You work in Human Resources at Merrill Lynch.
You plan to survey employees to find their
average medical expenses. You want to be
95% confident that the sample mean is within ±
$50.
A pilot study showed that σ was about $400.
What sample size do you use?
Confidence Interval for a
Population Variance
Conditions Required for a Valid
Confidence Interval for σ²