Chapter 11
Point Estimate
One way to obtain information about a population mean \(\mu\) is to estimate it by a sample mean
\(\bar{x}\), as illustrated in the next example.
EXAMPLE: The U.S. Census Bureau publishes annual price figures for new mobile homes in
Manufactured Housing Statistics. The figures are obtained from sampling, not from a census.
A simple random sample of 36 new mobile homes yielded the prices, in thousands of dollars,
shown in the Table below. Use the data to estimate the population mean price, \(\mu\), of all new
mobile homes.
Solution: We estimate the population mean price, \(\mu\), of all new mobile homes by the sample
mean price, \(\bar{x}\), of the 36 new mobile homes sampled. From the Table above,
\[ \bar{x} = \frac{\sum x_i}{n} = \frac{2278}{36} = 63.28 \]
Interpretation: Based on the sample data, we estimate the mean price, \(\mu\), of all new mobile
homes to be approximately $63.28 thousand, that is, $63,280.
An estimate of this kind is called a point estimate for \(\mu\) because it consists of a single number,
or point.
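The arithmetic behind the point estimate can be checked with a short Python sketch. The individual prices are not reproduced here, so the sketch works from the total (2278) and sample size (36) quoted in the solution; the function name `sample_mean_from_total` is purely illustrative.

```python
def sample_mean_from_total(total, n):
    """Point estimate of a population mean: sum of the observations divided by n."""
    return total / n


# Total of the 36 sampled prices (in thousands of dollars), as quoted in the solution.
x_bar = sample_mean_from_total(2278, 36)
print(round(x_bar, 2))  # 63.28
```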
Confidence-Interval Estimate
As we know, a sample mean is usually not equal to the population mean; generally, there is
sampling error. Therefore, we should accompany any point estimate of \(\mu\) with information
that indicates the accuracy of that estimate. This information is called a confidence-interval
estimate for \(\mu\), which we introduce in the next example.
EXAMPLE: Consider again the problem of estimating the (population) mean price, \(\mu\), of all new
mobile homes by using the sample data in the Table above. Let's assume that the population
standard deviation of all such prices is $7200, that is, \(\sigma = 7.2\) in thousands of dollars.
(a) Identify the distribution of the variable \(\bar{x}\), that is, the sampling distribution of the sample
mean for samples of size 36.
(b) Use part (a) to show that 95% of all samples of 36 new mobile homes have the property
that the interval from \(\bar{x} - 2.4\) to \(\bar{x} + 2.4\) contains \(\mu\).
(c) Use part (b) and the sample data in the Table above to find a 95% confidence interval for
\(\mu\), that is, an interval of numbers that we can be 95% confident contains \(\mu\).
Solution:
(a) A histogram of the price data in the Table above shows that the prices of new mobile
homes are normally distributed. Because \(n = 36\), \(\sigma = 7.2\), and prices of new mobile homes are
normally distributed, it follows that
    \(\mu_{\bar{x}} = \mu\) (which we don't know)
    \(\sigma_{\bar{x}} = \sigma/\sqrt{n} = 7.2/\sqrt{36} = 1.2\)
    \(\bar{x}\) is normally distributed
In other words, for samples of size 36, the variable \(\bar{x}\) is normally distributed with mean \(\mu\) and
standard deviation 1.2.
(b) The 95 part of the 68-95-99.7 rule states that, for a normally distributed variable, 95% of
all possible observations lie within two standard deviations to either side of the mean. Applying
this rule to the variable \(\bar{x}\) and referring to part (a), we see that 95% of all samples of 36 new
mobile homes have mean prices within \(2 \cdot 1.2 = 2.4\) of \(\mu\). Equivalently, 95% of all samples of
36 new mobile homes have the property that the interval from \(\bar{x} - 2.4\) to \(\bar{x} + 2.4\) contains \(\mu\).
(c) Because we are taking a simple random sample, each possible sample of size 36 is equally
likely to be the one obtained. From part (b), 95% of all such samples have the property that
the interval from \(\bar{x} - 2.4\) to \(\bar{x} + 2.4\) contains \(\mu\). Hence, chances are 95% that the sample we
obtain has that property. Consequently, we can be 95% confident that the sample of 36 new
mobile homes whose prices are shown in the Table above has the property that the interval
from \(\bar{x} - 2.4\) to \(\bar{x} + 2.4\) contains \(\mu\). For that sample, \(\bar{x} = 63.28\), so
\[ \bar{x} - 2.4 = 63.28 - 2.4 = 60.88 \quad\text{and}\quad \bar{x} + 2.4 = 63.28 + 2.4 = 65.68 \]
Interpretation: We can be 95% confident that the mean price, \(\mu\), of all new mobile homes is
somewhere between $60,880 and $65,680.
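As a rough check of parts (a) through (c), the following Python sketch computes the standard deviation of the sample mean and the resulting interval from the 68-95-99.7 rule. The function name `two_sigma_interval` is hypothetical; the inputs (63.28, 7.2, 36) are the values used in the example, in thousands of dollars.

```python
import math


def two_sigma_interval(sample_mean, sigma, n):
    """Approximate 95% interval via the 68-95-99.7 rule: x-bar +/- 2 * sigma / sqrt(n)."""
    standard_error = sigma / math.sqrt(n)  # standard deviation of the sample mean
    margin = 2 * standard_error            # "two standard deviations" from the rule
    return sample_mean - margin, sample_mean + margin


low, high = two_sigma_interval(63.28, 7.2, 36)
print(f"{low:.2f} to {high:.2f}")  # 60.88 to 65.68
```

The multiplier 2 is the rule-of-thumb value used in this example; later examples use the table value 1.96 instead.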
NOTE: A confidence interval for a population mean depends on the sample mean, \(\bar{x}\), which
in turn depends on the sample selected. For example, suppose that the prices of the 36 new
mobile homes sampled were as shown in the Table below instead of as in the Table above.
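More generally, we can say that \(100(1 - \alpha)\%\) of all samples of size \(n\) have means within
\(z_{\alpha/2} \cdot \sigma/\sqrt{n}\) of \(\mu\), as depicted in Figure (b) above. Equivalently, we can say that
\(100(1 - \alpha)\%\) of all samples of size \(n\) have the property that the interval from
\[ \bar{x} - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \quad\text{to}\quad \bar{x} + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \]
contains \(\mu\).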
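The claim that \(100(1 - \alpha)\%\) of all samples produce an interval containing \(\mu\) can be illustrated by simulation. The sketch below is only an illustration under assumed values: the population mean of 60.0 is hypothetical (in practice \(\mu\) is unknown), the population is assumed to be normal, and the function name `coverage_rate` is made up for this example.

```python
import math
import random
from statistics import NormalDist, fmean


def coverage_rate(mu, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated samples whose z-interval contains the true mean mu."""
    rng = random.Random(seed)
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}, about 1.96 for 95%
    margin = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one simple random sample of size n from a normal population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials


# mu = 60.0 is a hypothetical value chosen only so the simulation can be run.
rate = coverage_rate(mu=60.0, sigma=7.2, n=36, confidence=0.95, trials=2000)
print(f"fraction of intervals containing mu: {rate:.3f}")
```

With enough trials the printed fraction should be close to 0.95, although it varies slightly from run to run and with the seed.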
EXAMPLE: The Bureau of Labor Statistics collects information on the ages of people in the
civilian labor force and publishes the results in Employment and Earnings. Fifty people in the
civilian labor force are randomly selected; their ages are displayed in the Table below.
Find a 95% confidence interval for the mean age, \(\mu\), of all people in the civilian labor force.
Assume that the population standard deviation of the ages is 12.1 years.
Solution: Because the sample size is 50, which is large, and the population standard deviation
is known, we can use the Procedure above to find the required confidence interval.
Step 1: For a confidence level of \(1 - \alpha\), use Table I to find \(z_{\alpha/2}\).
We want a 95% confidence interval, so \(\alpha = 1 - 0.95 = 0.05\). From Table I,
\(z_{\alpha/2} = z_{0.05/2} = z_{0.025} = 1.96\).
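Step 2: The confidence interval for \(\mu\) is from
\[ \bar{x} - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \quad\text{to}\quad \bar{x} + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \]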
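We know \(\sigma = 12.1\), \(n = 50\), and, from Step 1, \(z_{\alpha/2} = 1.96\). To compute \(\bar{x}\) for the data in the
Table above, we apply the usual formula:
\[ \bar{x} = \frac{\sum x_i}{n} = \frac{1819}{50} = 36.4 \]
to one decimal place. Consequently, a 95% confidence interval for \(\mu\) is from
\[ 36.4 - 1.96 \cdot \frac{12.1}{\sqrt{50}} \quad\text{to}\quad 36.4 + 1.96 \cdot \frac{12.1}{\sqrt{50}} \]
or 33.0 to 39.8.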
Interpretation: We can be 95% confident that the mean age, \(\mu\), of all people in the civilian
labor force is somewhere between 33.0 years and 39.8 years.
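The two steps of this solution can be sketched in Python as follows. This is a minimal sketch, not the text's own procedure: it obtains the critical value from `statistics.NormalDist` in the standard library rather than from Table I, and the function name `z_interval` is an assumption made for illustration.

```python
import math
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for a population mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Step 1: z_{alpha/2}
    margin = z * sigma / math.sqrt(n)        # Step 2: z_{alpha/2} * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


low, high = z_interval(36.4, 12.1, 50)
print(f"{low:.1f} to {high:.1f}")  # 33.0 to 39.8
```

The computed critical value (about 1.95996) differs slightly from the rounded table value 1.96, but the endpoints agree to one decimal place.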
EXAMPLE: Consider again the problem of estimating the mean age, \(\mu\), of all people in the
civilian labor force.
(a) Determine the sample size needed in order to be 95% confident that \(\mu\) is within 0.5 year of
the estimate, \(\bar{x}\). Recall that \(\sigma = 12.1\) years.
(b) Find a 95% confidence interval for \(\mu\) if a sample of the size determined in part (a) has a
mean age of 38.8 years.
Solution:
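(a) To find the sample size, we use the Formula above. We know that \(\sigma = 12.1\) and \(E = 0.5\).
The confidence level is 0.95, which means that \(\alpha = 0.05\) and \(z_{\alpha/2} = z_{0.025} = 1.96\). Thus
\[ n = \left( \frac{z_{\alpha/2} \cdot \sigma}{E} \right)^2 = \left( \frac{1.96 \cdot 12.1}{0.5} \right)^2 = 2249.79 \]
Because the sample size must be a whole number, we round up to \(n = 2250\).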
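The sample-size calculation in part (a) can be sketched in Python. The function name `required_sample_size` is hypothetical, and the critical value comes from `statistics.NormalDist` rather than from the table, so the intermediate value differs slightly from 2249.79 before rounding up.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, margin_of_error, confidence=0.95):
    """Smallest n for which z_{alpha/2} * sigma / sqrt(n) is at most the margin of error E."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # The sample size must be a whole number, so round up.
    return math.ceil((z * sigma / margin_of_error) ** 2)


print(required_sample_size(12.1, 0.5))  # 2250
```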
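Replacing \(\sigma\) in the standardized version of \(\bar{x}\),
\[ z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \]
with the sample standard deviation \(s\) gives the studentized version of \(\bar{x}\),
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} \]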
which is a value of a random variable having the t-distribution. More specifically, this distribution is called the Student t-distribution or Student's t-distribution, as it was first
developed by a statistician, W. S. Gosset, who published his work under the pen name "Student."
There is a different t-distribution for each sample size. We identify a particular t-distribution
by its number of degrees of freedom (df). For the studentized version of \(\bar{x}\), the number of
degrees of freedom is 1 less than the sample size, which we indicate symbolically by \(df = n - 1\).
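To make the studentized version of \(\bar{x}\) concrete, the sketch below computes \(t = (\bar{x} - \mu)/(s/\sqrt{n})\) together with its degrees of freedom. The four-value sample and the value \(\mu = 3\) are hypothetical, chosen only so the arithmetic is easy to follow; `studentized_mean` is an illustrative name.

```python
import math
from statistics import fmean, stdev


def studentized_mean(sample, mu):
    """Return (t, df) where t = (x-bar - mu) / (s / sqrt(n)) and df = n - 1."""
    n = len(sample)
    x_bar = fmean(sample)
    s = stdev(sample)  # sample standard deviation (divides by n - 1)
    t = (x_bar - mu) / (s / math.sqrt(n))
    return t, n - 1


# Hypothetical data: mean 4, so t = (4 - 3) / (s / 2).
t_value, df = studentized_mean([2, 4, 4, 6], mu=3)
print(round(t_value, 4), df)  # 1.2247 3
```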
A variable with a t-distribution has an associated curve, called a t-curve. Although there is
a different t-curve for each number of degrees of freedom, all t-curves are similar and resemble
the standard normal curve, as illustrated in the Figure above (right).
Percentages (and probabilities) for a variable having a t-distribution equal areas under the
variable's associated t-curve. For our purposes, one of which is obtaining confidence intervals
for a population mean, we don't need a complete t-table for each t-curve; only certain areas
will be important. Table II is sufficient for our purposes.
EXAMPLE: For a t-curve with 13 degrees of freedom, determine \(t_{0.05}\); that is, find the t-value
having area 0.05 to its right, as shown in Figure (a) below.
Solution: To find the t-value in question, we use Table II, a portion of which is given below:
The number of degrees of freedom is 13, so we first go down the outside columns, labeled df,
to 13. Then, going across that row to the column labeled \(t_{0.05}\), we reach 1.771. This number
is the t-value having area 0.05 to its right, as shown in Figure (b) above. In other words, for a
t-curve with df = 13, \(t_{0.05} = 1.771\).
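The table lookup can be cross-checked numerically. Python's standard library has no t-distribution, so the sketch below is one possible approach, not the method of the text: it integrates the t-curve's density with Simpson's rule and then finds the critical value by bisection. All function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Area under the t-curve to the right of t >= 0 (Simpson's rule on [0, t])."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # By symmetry, half of the area lies to the right of 0.
    return 0.5 - total * h / 3


def t_critical(area, df):
    """t-value with the given area to its right, found by bisection on [0, 100]."""
    low, high = 0.0, 100.0
    for _ in range(60):
        mid = (low + high) / 2
        if t_upper_tail(mid, df) > area:
            low = mid   # too much area to the right: move right
        else:
            high = mid
    return (low + high) / 2


print(round(t_critical(0.05, 13), 3))  # 1.771
```

The bisection bracket of [0, 100] is an arbitrary choice that covers the tail areas used in this chapter; very small areas with few degrees of freedom would need a wider bracket.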
EXAMPLE: The Federal Bureau of Investigation (FBI) compiles data on robbery and property crimes and publishes the
information in Population-at-Risk Rates and Selected Crime
Indicators. A simple random sample of pickpocket offenses
yielded the losses, in dollars, shown in the Table on the right.
Use the data to find a 95% confidence interval for the mean
loss, \(\mu\), of all pickpocket offenses.
Solution: Because the sample size, n = 25, is moderate, we first need to consider questions of
normality. To do that, we constructed a histogram that reveals that indeed we have a roughly
normal population. So, we can apply the above Procedure to find the confidence interval.
Step 1: For a confidence level of \(1 - \alpha\), use Table II to find \(t_{\alpha/2}\) with \(df = n - 1\),
where \(n\) is the sample size.
We want a 95% confidence interval, so \(\alpha = 1 - 0.95 = 0.05\). For \(n = 25\), we have \(df = 25 - 1 = 24\).
From Table II, \(t_{\alpha/2} = t_{0.05/2} = t_{0.025} = 2.064\).
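Step 2: The confidence interval for \(\mu\) is from
\[ \bar{x} - t_{\alpha/2} \cdot \frac{s}{\sqrt{n}} \quad\text{to}\quad \bar{x} + t_{\alpha/2} \cdot \frac{s}{\sqrt{n}} \]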
From Step 1, \(t_{\alpha/2} = 2.064\). Applying the usual formulas for \(\bar{x}\) and \(s\) to the data in the Table
above gives \(\bar{x} = 513.32\) and \(s = 262.23\). So a 95% confidence interval for \(\mu\) is from
\[ 513.32 - 2.064 \cdot \frac{262.23}{\sqrt{25}} \quad\text{to}\quad 513.32 + 2.064 \cdot \frac{262.23}{\sqrt{25}} \]
or 405.07 to 621.57.
Interpretation: We can be 95% confident that the mean loss of all pickpocket offenses is
somewhere between $405.07 and $621.57.
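The interval arithmetic in Step 2 can be reproduced with a short Python sketch. Since the individual losses are not reproduced here, it starts from the summary values quoted in the solution (513.32, 262.23, n = 25) and takes the critical value 2.064 from Table II as an input; `t_interval` is an illustrative name.

```python
import math


def t_interval(sample_mean, s, n, t_crit):
    """Confidence interval for a population mean when sigma is unknown.

    t_crit is t_{alpha/2} for df = n - 1, looked up in a t-table.
    """
    margin = t_crit * s / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Summary statistics and the table value t_{0.025} = 2.064 (df = 24) from the example.
low, high = t_interval(513.32, 262.23, 25, t_crit=2.064)
print(f"{low:.2f} to {high:.2f}")  # 405.07 to 621.57
```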