S8 Estimate
Confidence Interval
Estimation
David Chow
Oct 2021
1
Learning Objectives
2
Basic Concepts
A point estimate is a single number
A confidence interval is an interval estimate; its width indicates the variability of the estimate
3
Basic Concepts
The general formula for all confidence intervals (CI) is:
Point estimate ± (critical value) × (standard error)
Two main applications:
4
Basic Concepts
5
Estimating μ
(σ Known)
6
Estimating μ
Assume population standard deviation σ is known
Also assume n is large enough (n > 30)
or a normally distributed population
These assumptions ensure that ____
$\bar{X} \pm Z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$
7
Critical Values of Z
Z, also written as Zα/2, is the standardized normal distribution
critical value for a probability of ____ in each tail
Z is associated with a confidence level of (1 − α) × 100%
Critical Z-value(s)
8
Critical Values of Z
More Examples
Z0.05 = ___
Z0.005 = ___
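The critical values on this slide can be looked up in a Z table or computed. A minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is hypothetical and not part of the slides:

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """Return Z_{alpha/2}, the value with probability alpha/2 in the upper tail."""
    alpha = 1.0 - confidence_level
    # inv_cdf takes the cumulative (left-side) probability, so use 1 - alpha/2.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

# Common confidence levels; the upper-tail areas are 0.05, 0.025 and 0.005.
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: Z = {z_critical(level):.3f}")
```

In the slide's notation, the 90% level corresponds to Z0.05 and the 99% level to Z0.005.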
9
Eg: Length of A4 Paper
A paper producer wants to check if the output
has the correct mean length of 11 inches
Find the 95% CI for the mean paper length
• Sample info
• Sample size n = 100
• Sample mean = 10.998 in
• σ is known to be 0.02
10
Eg: Length of A4 Paper
The 95% confidence interval is given by:
Conclusion?
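The interval for this example can be worked out from the sample information given earlier (n = 100, sample mean 10.998, σ = 0.02). A minimal sketch, assuming the σ-known formula for the mean; the variable names are illustrative only:

```python
from math import sqrt
from statistics import NormalDist

n = 100                 # sample size
sample_mean = 10.998    # inches
sigma = 0.02            # known population standard deviation
confidence_level = 0.95

# Critical value Z_{alpha/2} for a two-sided interval (about 1.96 at 95%).
z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
margin = z * sigma / sqrt(n)
lower, upper = sample_mean - margin, sample_mean + margin

print(f"{sample_mean} ± {margin:.5f} -> ({lower:.5f}, {upper:.5f})")

# Does the interval cover the target length of 11 inches?
contains_target = lower <= 11.0 <= upper
print("Interval contains 11 inches:", contains_target)
```

One reasonable reading of the result: because 11 inches lies inside the interval, this sample alone gives no reason to think the mean length is off target.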
11
Eg: Mean Resistance
12
Eg: Mean Resistance (Continued)
Recall: $\bar{X} \pm Z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$
13
Choosing Confidence Level
Confidence level is our choice
Common picks: 90%, 95% or 99%
14
Choosing Confidence Level
Confidence level is our choice
How to choose?
15
α, Confidence Intervals &
Sampling Distribution
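[Figure: sampling distribution of X̄, centered at μ_X̄ = μ, with area 1 − α in the middle and α/2 in each tail; confidence intervals built from sample means x̄1, x̄2, ... are drawn beneath it]
(1 − α) × 100% of intervals constructed contain μ; α × 100% do not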
16
Interpreting Confidence Intervals
Repeated Sampling
If we select many different samples of size n from a population,
and construct a 95% confidence interval for each sample, about 95% of those intervals will contain μ
Questions
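The repeated-sampling idea can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is taken to be normal with μ = 11 and σ = 0.02 (values borrowed from the paper example purely for illustration), and the function name `coverage` is hypothetical:

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage(mu=11.0, sigma=0.02, n=100, reps=2000, level=0.95, seed=0):
    """Share of simulated sigma-known intervals that contain the true mean mu."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(reps):
        # Draw one sample of size n and compute its mean.
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / reps

rate = coverage()
print(f"Share of intervals containing mu: {rate:.3f}")
```

The printed share should land close to 0.95. Any single interval either contains μ or it does not; the 95% describes the procedure, not one particular interval.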
17
Estimating μ
(σ Unknown)
Question
18
Estimating μ (Unknown σ)
Naturally, we use sample SD (S) to replace σ
Extra uncertainty introduced by S
S varies from sample to sample
The random variable t is defined as $t = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$
19
t Distribution
The t distribution is a family of curves
t distributions are:
• Bell-shaped and symmetric
• Characterized by degrees of freedom
A specific t distribution is associated with a particular df
In this chapter, df = n-1
20
t Distribution and d.f.
• t distributions look like the Z curve
• t distributions are generally flatter than Z
• The larger the sample size, the more the t distribution looks like Z
• The difference between t distribution and Z disappears when ____
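The "flatter than Z" claim above can be checked numerically by comparing the height of each curve at its center. A minimal sketch, assuming the standard t density formula evaluated at t = 0; the helper name `t_peak_height` is hypothetical:

```python
from math import exp, lgamma, log, pi, sqrt

def t_peak_height(df):
    """Height of the t density at t = 0 for `df` degrees of freedom."""
    # Work on the log scale so large df values do not overflow the gamma function.
    log_height = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_height)

z_peak_height = 1 / sqrt(2 * pi)  # height of the standard normal curve at 0

peaks = {df: t_peak_height(df) for df in (1, 5, 30, 1000)}
for df, height in peaks.items():
    print(f"df = {df:>4}: t peak = {height:.4f}  (Z peak = {z_peak_height:.4f})")
```

A lower peak means more of the area sits in the tails. The t peak rises toward the Z peak as df grows.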
21
Critical Values of t
Suppose n = 21, and α = 0.10
Then df = ____
upper-tail area = ____
d.f. = 20
α/2 = 0.05
22
Estimating μ (Unknown σ)
Assumptions: With an unknown σ, we need
(1) a normal population, or
(2) n ≥ 30
For a highly skewed population, or when outliers exist, n should be 50 or more
If the population is fairly symmetric, n = 15 can be enough
CI estimate: $\bar{X} \pm t_{n-1}\,\dfrac{S}{\sqrt{n}}$
where t, also written as tα/2,n-1, is the critical t-value
23
Eg: Mean Age of Retirement
A random sample of 25 retirees has mean age = 50 and std = 8
Find the 95% confidence interval for μ
Assume ____
$\bar{X} \pm t_{\alpha/2,\,n-1}\,\dfrac{S}{\sqrt{n}} = 50 \pm (2.0639)\,\dfrac{8}{\sqrt{25}}$
(46.698 , 53.302)
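The arithmetic for this example can be reproduced as follows. A minimal sketch: the critical value 2.0639 is copied from the slide (df = 24, upper-tail area 0.025) rather than computed, and the variable names are illustrative:

```python
from math import sqrt

n = 25               # sample size
sample_mean = 50.0   # mean retirement age in the sample
s = 8.0              # sample standard deviation
t_crit = 2.0639      # critical t-value taken from the slide, not computed here

margin = t_crit * s / sqrt(n)
lower, upper = sample_mean - margin, sample_mean + margin

print(f"margin of error = {margin:.3f}")
print(f"95% CI = ({lower:.3f}, {upper:.3f})")
```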
24
Critical t Values Again
Critical t values depend on two elements:
The confidence level (1 − α), and df
What is this “2.009”?
[Table: critical t values, by degrees of freedom and area in upper tail]
27
Confidence Intervals for π
Standard deviation is given by $\sigma_p = \sqrt{\dfrac{\pi(1-\pi)}{n}}$
As π is unknown, we will estimate with sample data: $\sqrt{\dfrac{p(1-p)}{n}}$
28
Confidence Intervals for the
Population Proportion π
The confidence interval for the population proportion is given by:
$p \pm Z\sqrt{\dfrac{p(1-p)}{n}}$
where
Z = critical Z-value given the level of confidence
p = sample proportion, n = sample size
29
Example: Vaccinations
1. A random sample of 100 people shows that 25 of them are vaccinated.
Form a 95% confidence interval for the population proportion
2. Compute the 95% confidence interval if n=1000
Interpretation
95% of intervals formed from
samples of size 100 in this manner
will cover the true proportion
Sample Size
If n=1000, margin of error = ____
For n=100, π = 25% ± 8.5%
For n=1000, π = 25% ± 2.7%
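Both margins of error in this example can be reproduced with the proportion formula. A minimal sketch, assuming the sample proportion stays at 25% when n = 1000 (that is, 250 vaccinated out of 1000); the function name `proportion_interval` is hypothetical:

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence_level=0.95):
    """Return (p, margin) for the interval p ± Z * sqrt(p(1-p)/n)."""
    p = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sqrt(p * (1 - p) / n)
    return p, margin

p_100, margin_100 = proportion_interval(25, 100)
p_1000, margin_1000 = proportion_interval(250, 1000)  # assumes p is still 25%

print(f"n = 100:  {p_100:.0%} ± {margin_100:.1%}")
print(f"n = 1000: {p_1000:.0%} ± {margin_1000:.1%}")
```

Multiplying the sample size by 10 shrinks the margin of error by a factor of √10, about 3.16, not by a factor of 10.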
30
Sample Size
Determination
31
Sample Size Determination
Recall that sample size (n) affects the margin of error (e)
e is also called sampling error: $e = Z\,\dfrac{\sigma}{\sqrt{n}}$
$n = \dfrac{Z^2\sigma^2}{e^2}$
32
Sample Size Determination
$n = \dfrac{Z^2\sigma^2}{e^2} = \dfrac{(1.645)^2(45)^2}{5^2} = 219.19$
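The calculation above can be checked in a few lines. A minimal sketch using the slide's values (Z = 1.645, σ = 45, e = 5); rounding up to the next whole observation is the usual convention and is an assumption here, and the function name is hypothetical:

```python
from math import ceil

def sample_size_for_mean(z, sigma, e):
    """Unrounded sample size n = Z^2 * sigma^2 / e^2."""
    return (z ** 2) * (sigma ** 2) / (e ** 2)

n_raw = sample_size_for_mean(z=1.645, sigma=45, e=5)
n_required = ceil(n_raw)  # round up so the error does not exceed e

print(f"unrounded n = {n_raw:.2f}, required n = {n_required}")
```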
33
Eg: A4 Paper Again
Recall the A4 paper example where σ = 0.02 and n = 100
The 95% interval estimate is 10.998 ± 0.00392 inches
Suppose the manufacturer wants to limit the error to 0.003 by
choosing a larger sample. What is n?
ANSWER
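One way to work this out is to apply n = Z²σ²/e² and then confirm the result by checking the margin of error on either side of it. A minimal sketch, assuming the table value Z = 1.96 for 95% confidence; the helper name `margin_of_error` is hypothetical:

```python
from math import ceil, sqrt

z = 1.96        # table value for 95% confidence
sigma = 0.02    # known population standard deviation (inches)
e = 0.003       # desired maximum error (inches)

n_raw = (z ** 2) * (sigma ** 2) / (e ** 2)
n_required = ceil(n_raw)  # round up to a whole observation

def margin_of_error(n):
    """Margin of error Z * sigma / sqrt(n) for a sample of size n."""
    return z * sigma / sqrt(n)

print(f"unrounded n = {n_raw:.2f}, required n = {n_required}")
print(f"margin at n = {n_required - 1}: {margin_of_error(n_required - 1):.5f}")
print(f"margin at n = {n_required}: {margin_of_error(n_required):.5f}")
```

The check shows that one observation fewer would leave the error slightly above 0.003.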
34
Sample Size Determination
To find the required sample size for the proportion, you must know:
The critical value Z (from a confidence level of 1-α)
The acceptable sampling error (e), and
The true proportion π
If π is unknown, use the sample value p, or set π = 0.50
$e = Z\sqrt{\dfrac{\pi(1-\pi)}{n}}$
Solve for n to get $n = \dfrac{Z^2\,\pi(1-\pi)}{e^2}$
35
Eg: Quality Control
Out of a population of 1,000 light bulbs, we randomly selected
100 of which 30 were defective. What sample size is needed
to be within ± 0.05 with 90% confidence?
(a) As the population proportion is unknown, use the sample value
(b) Now, set π = 0.50 and compare the result with (a)
ANSWER
(a) $n = \dfrac{Z^2\,p(1-p)}{e^2} = \dfrac{(1.645)^2(0.3)(0.7)}{0.05^2} = 227.3$, rounded up to 228
(b) The required sample size increases to 271
NOTE: The product π (1- π) ranges from 0 to 0.25. By assuming a value of 0.25, we are in fact playing safe by sampling more than necessary.
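Parts (a) and (b) can be compared side by side. A minimal sketch using the slide's values (Z = 1.645, e = 0.05); the function name `sample_size_for_proportion` is hypothetical:

```python
from math import ceil

def sample_size_for_proportion(z, proportion, e):
    """Unrounded sample size n = Z^2 * pi * (1 - pi) / e^2."""
    return (z ** 2) * proportion * (1 - proportion) / (e ** 2)

z, e = 1.645, 0.05

n_a = sample_size_for_proportion(z, 0.30, e)  # (a) use the sample value p = 0.30
n_b = sample_size_for_proportion(z, 0.50, e)  # (b) conservative choice pi = 0.50

print(f"(a) p = 0.30: n = {n_a:.1f} -> {ceil(n_a)}")
print(f"(b) pi = 0.50: n = {n_b:.1f} -> {ceil(n_b)}")
```

Setting π = 0.50 maximizes π(1 − π), so (b) gives the largest sample size any proportion could require at this error and confidence level.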
36
More on the
t Distribution
37
t Distribution
38
Degrees of Freedom
The critical value of t is characterized by two elements:
The confidence level (1 − α), and
The degrees of freedom (df)
What is d.f.?
It is the number of observations that are free to vary
after sample mean has been calculated
In this section, df = n-1
39
Degrees of Freedom
Eg: Suppose the mean of 3 numbers is 8.0
Let X1 = 7, X2 = 8
What is X3?
d.f. = 2 … What does it mean?
You are “free” to choose 2 values (X1 and X2), but the third is set for a given mean
40
Degrees of Freedom
t → Z as n increases
41
Review Questions
A random sample of 100 from a population is selected, and X̄ is 600. σ is
known to be 50. To find the 95% CI, we use the ___ table
A) Z, because sigma is known
B) Z, because n is greater than 30
C) t
D) Both (A) and (B)
c. No. Since σ is known and n > 30, we may assume that the
sampling distribution of ____ is approximately normal