RM Final Slides
and
Sample Size Determination
Enumeration:
Population Census
Definition
The assignment of value(s) to a population
parameter based on a value of the
corresponding sample statistic is called
estimation.
Select a sample.
Collect the required information from the
members of the sample.
Calculate the value of the sample statistic.
Assign value(s) to the corresponding
population parameter.
Prem Mann, Introductory Statistics, 7/E
Copyright © 2010 John Wiley & Sons. All right reserved
POINT AND INTERVAL ESTIMATES
A Point Estimate
An Interval Estimate
Definition
The value of a sample statistic that is used
to estimate a population parameter is
called a point estimate.
Definition
In interval estimation, an interval is
constructed around the point estimate,
and it is stated that this interval is likely to
contain the corresponding population
parameter.
$$\bar{x} \pm z\sigma_{\bar{x}}$$
where $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
The value of z used here is obtained from the
standard normal distribution table (Table IV of
Appendix C) for the given confidence level.
ESTIMATION OF A POPULATION MEAN:
σ KNOWN
Definition
The margin of error for the estimate
for μ, denoted by E, is the quantity that is
subtracted from and added to the value of
x̄ to obtain a confidence interval for μ.
Thus,
$$E = z\sigma_{\bar{x}}$$
a)
n = 25, x̄ = $145, and σ = $35
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{35}{\sqrt{25}} = \$7.00$$
Point estimate of μ = x̄ = $145
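$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2.20}{\sqrt{400}} = .11$$
$$\bar{x} \pm z\sigma_{\bar{x}} = 8 \pm 2.58(.11) = 8 \pm .28$$
= 7.72 to 8.28 minutes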
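The interval above can be checked with a minimal Python sketch. The function name `z_interval` is hypothetical, and the 99% confidence level is an assumption inferred from the table value z = 2.58. The code takes z from the standard library instead of Table IV, so it uses the unrounded value (about 2.576).

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence):
    """Return (lower, upper) for x_bar ± z * sigma / sqrt(n), sigma known."""
    # Two-tailed critical value: an area of (1 - confidence) / 2 lies in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)  # margin of error E
    return x_bar - margin, x_bar + margin


# Numbers from the worked example; the 99% level is an assumption (see lead-in).
lower, upper = z_interval(x_bar=8, sigma=2.20, n=400, confidence=0.99)
print(f"{lower:.2f} to {upper:.2f} minutes")
```

To two decimal places the unrounded z gives the same endpoints as the table value.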
$$n = \frac{z^2 \sigma^2}{E^2}$$
Example 8-3
An alumni association wants to estimate
the mean debt of this year’s college
graduates. It is known that the population
standard deviation of the debts of this
year’s college graduates is $11,800. How
large a sample should be selected so that
the estimate with a 99% confidence level is
within $800 of the population mean?
$$n = \frac{z^2 \sigma^2}{E^2} = \frac{(2.58)^2 (11,800)^2}{(800)^2} = 1448.18 \approx 1449$$
Thus, the required sample size is 1449.
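The calculation in Example 8-3 can be sketched in Python as follows. The function name `sample_size_for_mean` is hypothetical. The z value is passed in explicitly so that the table value 2.58 used on the slide is reproduced; an unrounded z would give a slightly smaller n.

```python
from math import ceil


def sample_size_for_mean(z, sigma, margin):
    """Return (raw, n): raw = z^2 * sigma^2 / E^2, and n = raw rounded up."""
    raw = (z * sigma / margin) ** 2
    # A sample size is always rounded up to the next whole number.
    return raw, ceil(raw)


# Example 8-3: 99% confidence (table z = 2.58), sigma = $11,800, E = $800.
raw, n = sample_size_for_mean(z=2.58, sigma=11_800, margin=800)
print(f"{raw:.2f} -> n = {n}")
```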
$$E = t s_{\bar{x}}$$
Example 8-5
Dr. Moore wanted to estimate the mean
cholesterol level for all adult men living in
Hartford. He took a sample of 25 adult men from
Hartford and found that the mean cholesterol level
for this sample is 186 mg/dL with a standard
deviation of 12 mg/dL. Assume that the
cholesterol levels for all adult men in Hartford are
(approximately) normally distributed. Construct a
95% confidence interval for the population mean
μ.
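One way to work Example 8-5 is sketched below. Python's standard library has no t distribution, so the critical value is supplied by hand. The value 2.064 (df = 24, area .025 in each tail) is an assumption read from a standard t table, and `t_interval` is a hypothetical helper name.

```python
from math import sqrt


def t_interval(x_bar, s, n, t_crit):
    """Return (lower, upper) for x_bar ± t * s / sqrt(n), sigma unknown."""
    margin = t_crit * s / sqrt(n)  # margin of error E = t * s_xbar
    return x_bar - margin, x_bar + margin


# Assumed table value: t for df = n - 1 = 24 with .025 in each tail.
T_CRIT_DF24 = 2.064

lower, upper = t_interval(x_bar=186, s=12, n=25, t_crit=T_CRIT_DF24)
print(f"{lower:.2f} to {upper:.2f} mg/dL")
```

Under these assumptions the 95% interval is roughly 181.05 to 190.95 mg/dL.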
$$s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{300}{\sqrt{64}} = \$37.50$$
$$s_{\hat{p}} = \sqrt{\frac{\hat{p}\hat{q}}{n}}$$
ESTIMATION OF A POPULATION
PROPORTION: LARGE SAMPLES
Confidence Interval for the Population
Proportion, p
$$\hat{p} \pm z s_{\hat{p}}$$
The value of z used here is obtained from the
standard normal distribution table for the given
confidence level, and $s_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n}$. The term $z s_{\hat{p}}$
is called the margin of error, E.
a)
Point estimate of p = p̂ = .44
$$s_{\hat{p}} = \sqrt{\frac{\hat{p}\hat{q}}{n}} = \sqrt{\frac{(.25)(.75)}{2401}} = .00883699$$
The value of z for .97 / 2 = .4850 is 2.17.
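A short sketch of how these two numbers combine into a margin of error. It assumes that the standard error above (p̂ = .25, n = 2401) and the 97% confidence level belong to the same exercise, which the slides do not state explicitly. The function name `proportion_margin` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_margin(p_hat, n, confidence):
    """Return (z, standard_error, margin) for a large-sample proportion interval."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    std_err = sqrt(p_hat * (1 - p_hat) / n)  # s_p-hat = sqrt(p-hat * q-hat / n)
    return z, std_err, z * std_err


# Assumed pairing of values (see lead-in): p-hat = .25, n = 2401, 97% confidence.
z, std_err, margin = proportion_margin(p_hat=0.25, n=2401, confidence=0.97)
print(f"z = {z:.2f}, s = {std_err:.8f}, E = {margin:.4f}")
```

The interval itself would then be p̂ minus and plus the printed margin E.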
$$n = \frac{z^2 \hat{p}\hat{q}}{E^2}$$
DETERMINING THE SAMPLE SIZE FOR
THE ESTIMATION OF PROPORTION
In case the values of p̂ and q̂ are not known
$$n = \frac{z^2 \hat{p}\hat{q}}{E^2} = \frac{(1.96)^2 (.07)(.93)}{(.02)^2} = \frac{(3.8416)(.07)(.93)}{.0004} = 625.22 \approx 626$$
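The sample-size calculation for a proportion can be sketched as below. The function name `sample_size_for_proportion` is hypothetical. The default `p_hat=0.5` is an assumption on my part: it is the common conservative choice when no preliminary estimate exists, because p̂q̂ is largest at .5.

```python
from math import ceil


def sample_size_for_proportion(z, margin, p_hat=0.5):
    """Return (raw, n): raw = z^2 * p_hat * q_hat / E^2, and n = raw rounded up."""
    q_hat = 1 - p_hat
    raw = z ** 2 * p_hat * q_hat / margin ** 2
    # Round away floating-point noise before rounding up to a whole unit.
    return raw, ceil(round(raw, 9))


# Values from the worked example: z = 1.96, p-hat = .07, E = .02.
raw, n = sample_size_for_proportion(z=1.96, margin=0.02, p_hat=0.07)
print(f"{raw:.2f} -> n = {n}")
```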
Definition
A null hypothesis is a claim (or statement)
about a population parameter that is
assumed to be true until it is declared false.
Definition
An alternative hypothesis is a claim about
a population parameter that will be true if
the null hypothesis is false.
Definition
A Type I error occurs when a true null
hypothesis is rejected. The value of α
represents the probability of committing
this type of error; that is,
α = P(H0 is rejected | H0 is true)
The value of α represents the significance
level of the test.
Definition
A Type II error occurs when a false null
hypothesis is not rejected. The value of β
represents the probability of committing a Type II
error; that is,
β = P (H0 is not rejected | H0 is false)
The value of 1 – β is called the power of the
test. It represents the probability of not making
a Type II error.
Definition
A two-tailed test has rejection regions in
both tails, a left-tailed test has the
rejection region in the left tail, and a
right-tailed test has the rejection region
in the right tail of the distribution curve.
Step 3: α = .02
The ≠ sign in the alternative hypothesis
indicates that the test is two-tailed
Area in each tail = α / 2= .02 / 2 = .01
The z values for the two critical points are
-2.33 and 2.33
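The critical points quoted above can be reproduced without a table. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `critical_z` and its `tail` argument are hypothetical.

```python
from statistics import NormalDist


def critical_z(alpha, tail):
    """Return the critical z value(s) for a 'two', 'left', or 'right' tailed test."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        z = std_normal.inv_cdf(1 - alpha / 2)  # alpha / 2 in each tail
        return (-z, z)
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)
    raise ValueError("tail must be 'two', 'left', or 'right'")


two_tailed = critical_z(0.02, "two")     # alpha = .02, two-tailed
left_tailed = critical_z(0.025, "left")  # alpha = .025, left-tailed
print([round(z, 2) for z in two_tailed], round(left_tailed[0], 2))
```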
Step 1: H0 : μ ≥ $300,000
H1 : μ < $300,000
Step 2: The population standard deviation
σ is known, the sample size is small (n <
30), but the population distribution is
normal. Consequently, we will use the
normal distribution to perform the test.
Step 3: α = .025
The < sign in the alternative hypothesis
indicates that the test is left-tailed
Area in the left tail = α = .025
The critical value of z is -1.96
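$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{80,000}{\sqrt{25}} = \$16,000$$
$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{288,000 - 300,000}{16,000} = -.75$$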
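The left-tailed test set up in Steps 1 to 3 can be followed through in code. This is a sketch with a hypothetical function name, `z_test_left_tailed`; the inputs are the values shown in the calculation above, and the decision rule is the usual one of rejecting H0 when z falls below the critical value.

```python
from math import sqrt
from statistics import NormalDist


def z_test_left_tailed(x_bar, mu0, sigma, n, alpha):
    """Return (z, z_crit, reject) for H0: mu >= mu0 against H1: mu < mu0."""
    std_err = sigma / sqrt(n)             # sigma_xbar
    z = (x_bar - mu0) / std_err           # observed test statistic
    z_crit = NormalDist().inv_cdf(alpha)  # critical value in the left tail
    return z, z_crit, z < z_crit


z, z_crit, reject = z_test_left_tailed(
    x_bar=288_000, mu0=300_000, sigma=80_000, n=25, alpha=0.025
)
print(f"z = {z:.2f}, critical z = {z_crit:.2f}, reject H0: {reject}")
```

Because z = -.75 lies to the right of -1.96, it falls in the nonrejection region and H0 is not rejected.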
Test Statistic
The value of the test statistic t for the
sample mean x̄ is computed as
$$t = \frac{\bar{x} - \mu}{s_{\bar{x}}} \quad \text{where} \quad s_{\bar{x}} = \frac{s}{\sqrt{n}}$$
Step 1: H0 : μ = 12.5
H1 : μ ≠ 12.5
Step 2: The population standard deviation
σ is not known, the sample size is small (n
< 30), and the population is normally
distributed. Consequently, we will use the
t distribution to find the p-value for the
test.
and df = n – 1 = 18 – 1 = 17
.02 < p-value < .05
Step 1: H0 : μ ≥ 65
H1 : μ < 65
Step 2: The population standard deviation
σ is not known and the sample size is large
(n > 30). Consequently, we will use the t
distribution to find the p-value for the test.
and df = n – 1 = 45 – 1 = 44
p-value < .001
Step 1: H0 : μ = 12.5
H1 : μ ≠ 12.5
Step 2: The population standard deviation
σ is not known, the sample size is small (n
< 30), and the population is normally
distributed. Consequently, we will use the
t distribution to perform the test.
Step 1: H0 : μ = 22
H1 : μ > 22
Step 2: The population standard deviation
σ is not known and the sample size is large
(n > 30). Consequently, we will use the t
distribution to perform the test.
Step 1: H0 : p = .81
H1 : p ≠ .81
Step 2: To check whether the sample is
large, we calculate the values of np and nq:
np = 1600(.81) = 1296 > 5
nq = 1600(.19) = 304 > 5
Consequently, we will use the normal
distribution to find the p-value for this test.
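$$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(.81)(.19)}{1600}} = .00980752$$
$$z = \frac{\hat{p} - p}{\sigma_{\hat{p}}} = \frac{.83 - .81}{.00980752} = 2.04$$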
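$$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(.04)(.96)}{200}} = .01385641$$
$$z = \frac{\hat{p} - p}{\sigma_{\hat{p}}} = \frac{.06 - .04}{.01385641} = 1.44$$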
p-value = .0749
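The p-value above can be checked with a short sketch. It assumes the test is right-tailed (H1: p > .04), since .0749 is the area to the right of z = 1.44; the slides do not show the hypotheses for this example. The function name `proportion_z` is hypothetical, and z is rounded to two decimals to mimic a table lookup.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z(p_hat, p0, n):
    """Return the z statistic for a large-sample test about a proportion."""
    std_err = sqrt(p0 * (1 - p0) / n)  # sigma_p-hat uses p and q from H0
    return (p_hat - p0) / std_err


# Values from the calculation above; z is rounded as it would be for the table.
z = round(proportion_z(p_hat=0.06, p0=0.04, n=200), 2)
p_value = 1 - NormalDist().cdf(z)  # right-tail area (assumed right-tailed test)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```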
Step 1: H0 : p = .81
H1 : p ≠ .81
Step 2: To check whether the sample is
large, we calculate the values of np and nq:
np = 1600(.81) = 1296 > 5
nq = 1600(.19) = 304 > 5
Consequently, we will use the normal
distribution to make the test.
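$$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(.81)(.19)}{1600}} = .00980752$$
$$z = \frac{\hat{p} - p}{\sigma_{\hat{p}}} = \frac{.83 - .81}{.00980752} = 2.04$$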
Step 1: H0 : p ≥ .90
H1 : p < .90
Step 2: To check whether the sample is
large, we calculate the values of np and nq:
np = 150(.90) = 135 > 5
nq = 150(.10) = 15 > 5
Consequently, we will use the normal
distribution to make the test.
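$$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(.90)(.10)}{150}} = .02449490$$
$$z = \frac{\hat{p} - p}{\sigma_{\hat{p}}} = \frac{.86 - .90}{.02449490} = -1.63$$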
Screen 9.3
Excel
Definition
1. The F distribution is continuous and
skewed to the right.
2. The F distribution has two numbers of
degrees of freedom: df for the numerator
and df for the denominator.
3. The units of an F distribution, denoted F,
are nonnegative.
Definition
ANOVA is a procedure used to test the null
hypothesis that the means of three or more
populations are equal.
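$$SSB = \left( \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \frac{T_3^2}{n_3} + \cdots \right) - \frac{(\sum x)^2}{n}$$
$$SSW = \sum x^2 - \left( \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \frac{T_3^2}{n_3} + \cdots \right)$$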
∑x = T1 + T2 + T3 = 324+369+388 = 1081
n = n1 + n2 + n3 = 5+5+5 = 15
Σx² = (48)² + (73)² + (51)² + (65)² +
(87)² + (55)² + (85)² + (70)² +
(69)² + (90)² + (84)² + (68)² +
(95)² + (74)² + (67)²
= 80,709
$$MSB = \frac{SSB}{k - 1} = \frac{432.1333}{3 - 1} = 216.0667$$
$$MSW = \frac{SSW}{n - k} = \frac{2372.8000}{15 - 3} = 197.7333$$
$$F = \frac{MSB}{MSW} = \frac{216.0667}{197.7333} = 1.09$$
Step 4 & 5:
The value of the test statistic F = 1.09
It is less than the critical value of F = 6.93
It falls in the nonrejection region
Hence, we fail to reject the null hypothesis
We conclude that there is no significant evidence
that the means of the three populations differ.
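The whole calculation for this example can be reproduced from the raw scores. The sketch below assumes the fifteen values listed under Σx² belong to the three samples in the order shown, five per sample, which is consistent with the totals T1 = 324, T2 = 369, and T3 = 388. The function name `one_way_anova` is hypothetical.

```python
def one_way_anova(groups):
    """Return SSB, SSW, MSB, MSW, and F using the short-cut formulas."""
    k = len(groups)                                  # number of samples
    n = sum(len(g) for g in groups)                  # total sample size
    grand_total = sum(sum(g) for g in groups)        # sum of x
    sum_sq = sum(x * x for g in groups for x in g)   # sum of x squared
    # Shared term: T1^2/n1 + T2^2/n2 + ...
    between_term = sum(sum(g) ** 2 / len(g) for g in groups)
    ssb = between_term - grand_total ** 2 / n
    ssw = sum_sq - between_term
    msb = ssb / (k - 1)
    msw = ssw / (n - k)
    return {"SSB": ssb, "SSW": ssw, "MSB": msb, "MSW": msw, "F": msb / msw}


# Assumed grouping of the listed scores (see lead-in).
samples = [
    [48, 73, 51, 65, 87],
    [55, 85, 70, 69, 90],
    [84, 68, 95, 74, 67],
]
result = one_way_anova(samples)
print({name: round(value, 4) for name, value in result.items()})
```

Comparing the printed F with the critical value 6.93 gives the same decision as Steps 4 and 5.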
Step 1:
H0: μ1 = μ2 = μ3 = μ4 (The mean number of
customers served per hour by each of the
four tellers is the same)
H1: Not all four population means are equal
Step 2:
Because we are testing for the equality of
four means for four normally distributed
populations, we use the F distribution to
make the test.
Step 3:
α = .05.
A one-way ANOVA test is always right-
tailed.
Area in the right tail is .05.
df for the numerator = k – 1 = 4 – 1 = 3
df for the denominator = n – k = 22 – 4
= 18
Figure 12.4 Critical value of F for df = (3, 18) and
α = .05.
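$$SSB = \left( \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \frac{T_3^2}{n_3} + \frac{T_4^2}{n_4} \right) - \frac{(\sum x)^2}{n}$$
$$= \left( \frac{(108)^2}{5} + \frac{(87)^2}{6} + \frac{(93)^2}{6} + \frac{(110)^2}{5} \right) - \frac{(398)^2}{22} = 255.6182$$
$$SSW = \sum x^2 - \left( \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \frac{T_3^2}{n_3} + \frac{T_4^2}{n_4} \right)$$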
$$MSB = \frac{SSB}{k - 1} = \frac{255.6182}{4 - 1} = 85.2061$$
$$MSW = \frac{SSW}{n - k} = \frac{158.2000}{22 - 4} = 8.7889$$
$$F = \frac{MSB}{MSW} = \frac{85.2061}{8.7889} = 9.69$$
Step 5:
The value for the test statistic F = 9.69
It is greater than the critical value of F = 3.16
It falls in the rejection region
Consequently, we reject the null
hypothesis
We conclude that the mean number of
customers served per hour by each of the
four tellers is not the same.
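The raw observations for the four tellers are not shown on these slides, so the F statistic can only be rebuilt from summary values. The sketch below works from the sample totals (108, 87, 93, 110), the sample sizes (5, 6, 6, 5), and the stated SSW = 158.2. The function name `f_from_totals` is hypothetical.

```python
def f_from_totals(totals, sizes, ssw):
    """Return (SSB, MSB, MSW, F) from sample totals, sample sizes, and SSW."""
    k = len(totals)
    n = sum(sizes)
    # SSB = (T1^2/n1 + T2^2/n2 + ...) - (sum of x)^2 / n
    ssb = sum(t * t / m for t, m in zip(totals, sizes)) - sum(totals) ** 2 / n
    msb = ssb / (k - 1)
    msw = ssw / (n - k)
    return ssb, msb, msw, msb / msw


ssb, msb, msw, f_stat = f_from_totals(
    totals=[108, 87, 93, 110], sizes=[5, 6, 6, 5], ssw=158.2
)
critical_f = 3.16  # table value quoted in Step 5 for df = (3, 18), alpha = .05
print(f"F = {f_stat:.2f}, reject H0: {f_stat > critical_f}")
```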
TI-84
Screen 12.4
Excel