8.5 Sampling Distribution of S²
The probability that a random sample produces a χ² value greater than some specified value is equal to the area under the curve to the right of this value. It is customary to let χ²_α represent the χ² value above which we find an area of α. This is illustrated by the shaded region in Figure 8.7.
Figure 8.7: The chi-squared distribution.
Table A.5 gives values of χ²_α for various values of α and v. The areas, α, are the column headings; the degrees of freedom, v, are given in the left column; and the table entries are the χ² values. Hence, the χ² value with 7 degrees of freedom, leaving an area of 0.05 to the right, is χ²_{0.05} = 14.067. Owing to lack of symmetry, we must also use the tables to find χ²_{0.95} = 2.167 for v = 7.
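Table lookups like these can be reproduced numerically; the sketch below assumes SciPy is available and uses its `chi2.ppf` inverse CDF. Since the tabulated area α lies to the right of χ²_α, the quantile is evaluated at 1 − α.

```python
from scipy.stats import chi2

v = 7  # degrees of freedom

# chi^2_alpha leaves area alpha to the right, so query the (1 - alpha) quantile.
chi2_05 = chi2.ppf(1 - 0.05, v)   # chi^2_{0.05}
chi2_95 = chi2.ppf(1 - 0.95, v)   # chi^2_{0.95}

print(round(chi2_05, 3))  # matches Table A.5: 14.067
print(round(chi2_95, 3))  # matches Table A.5: 2.167
```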
Exactly 95% of a chi-squared distribution lies between χ²_{0.975} and χ²_{0.025}. A χ² value falling to the right of χ²_{0.025} is not likely to occur unless our assumed value of σ² is too small. Similarly, a χ² value falling to the left of χ²_{0.975} is unlikely unless our assumed value of σ² is too large. In other words, it is possible to have a χ² value to the left of χ²_{0.975} or to the right of χ²_{0.025} when σ² is correct, but if this should occur, it is more probable that the assumed value of σ² is in error.
Example 8.7: A manufacturer of car batteries guarantees that the batteries will last, on average, 3 years with a standard deviation of 1 year. If five of these batteries have lifetimes of 1.9, 2.4, 3.0, 3.5, and 4.2 years, should the manufacturer still be convinced that the batteries have a standard deviation of 1 year? Assume that the battery lifetime follows a normal distribution.
Solution: We first find the sample variance using Theorem 8.1,
\[ s^2 = \frac{(5)(48.26) - (15)^2}{(5)(4)} = 0.815. \]
Chapter 8 Fundamental Sampling Distributions and Data Descriptions

Then
\[ \chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(4)(0.815)}{1} = 3.26 \]
is a value from a chi-squared distribution with 4 degrees of freedom. Since 95% of the χ² values with 4 degrees of freedom fall between 0.484 and 11.143, the computed value with σ² = 1 is reasonable, and therefore the manufacturer has no reason to suspect that the standard deviation is other than 1 year.
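The arithmetic in Example 8.7 can be double-checked in a few lines; this sketch assumes SciPy for the chi-squared cutoffs, while the sample variance comes from the standard library.

```python
from statistics import variance
from scipy.stats import chi2

lifetimes = [1.9, 2.4, 3.0, 3.5, 4.2]
n = len(lifetimes)

s2 = variance(lifetimes)          # sample variance (divisor n - 1)
chi_sq = (n - 1) * s2 / 1.0**2    # (n - 1)s^2 / sigma^2 with sigma = 1

lo = chi2.ppf(0.025, n - 1)       # chi^2_{0.975} with 4 df, about 0.484
hi = chi2.ppf(0.975, n - 1)       # chi^2_{0.025} with 4 df, about 11.143

print(round(s2, 3), round(chi_sq, 2))   # 0.815 3.26
print(lo < chi_sq < hi)                 # True: no reason to doubt sigma = 1
```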
Degrees of Freedom as a Measure of Sample Information
Recall from Corollary 7.1 in Section 7.3 that
\[ \sum_{i=1}^{n} \frac{(X_i - \mu)^2}{\sigma^2} \]
has a χ²-distribution with n degrees of freedom. Note also Theorem 8.4, which indicates that the random variable
\[ \frac{(n-1)S^2}{\sigma^2} = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\sigma^2} \]
has a χ²-distribution with n − 1 degrees of freedom. The reader may also recall that the term degrees of freedom, used in this identical context, is discussed in Chapter 1.
As we indicated earlier, the proof of Theorem 8.4 will not be given. However, the reader can view Theorem 8.4 as indicating that when μ is not known and one considers the distribution of
\[ \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\sigma^2}, \]
there is 1 less degree of freedom, or a degree of freedom is lost in the estimation of μ (i.e., when μ is replaced by x̄). In other words, there are n degrees of freedom, or independent pieces of information, in the random sample from the normal distribution. When the data (the values in the sample) are used to compute the mean, there is 1 less degree of freedom in the information used to estimate σ².
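The loss of one degree of freedom can be seen in simulation: averaging the scaled sum of squares centered on μ gives about n, while centering on x̄ gives about n − 1. A small Monte Carlo sketch with hypothetical parameters, using only the standard library:

```python
import random

random.seed(1)
n, mu, sigma, trials = 5, 0.0, 1.0, 20000

def avg_scaled_sum_sq(center_on_xbar):
    """Average of sum((x_i - c)^2) / sigma^2 over many simulated samples."""
    total = 0.0
    for _ in range(trials):
        xs = [random.gauss(mu, sigma) for _ in range(n)]
        c = sum(xs) / n if center_on_xbar else mu
        total += sum((x - c) ** 2 for x in xs) / sigma**2
    return total / trials

with_mu = avg_scaled_sum_sq(False)    # close to n (n degrees of freedom)
with_xbar = avg_scaled_sum_sq(True)   # close to n - 1 (one lost estimating mu)
print(round(with_mu, 1), round(with_xbar, 1))
```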
8.6 t-Distribution
In Section 8.4, we discussed the utility of the Central Limit Theorem. Its applications revolve around inferences on a population mean or the difference between two population means. Use of the Central Limit Theorem and the normal distribution is certainly helpful in this context. However, it was assumed that the population standard deviation is known. This assumption may not be unreasonable in situations where the engineer is quite familiar with the system or process. However, in many experimental scenarios, knowledge of σ is certainly no more reasonable than knowledge of the population mean μ. Often, in fact, an estimate of σ must be supplied by the same sample information that produced the sample average x̄. As a result, a natural statistic to consider to deal with inferences on μ is
\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}}, \]
since S is the sample analog to σ. If the sample size is small, the values of S² fluctuate considerably from sample to sample (see Exercise 8.43 on page 259) and the distribution of T deviates appreciably from that of a standard normal distribution.

If the sample size is large enough, say n ≥ 30, the distribution of T does not differ considerably from the standard normal. However, for n < 30, it is useful to deal with the exact distribution of T. In developing the sampling distribution of T, we shall assume that our random sample was selected from a normal population.
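The n ≥ 30 rule of thumb can be made concrete by comparing upper 2.5% points of T and Z; this sketch assumes SciPy. The t quantile shrinks toward the normal's 1.960 as the degrees of freedom grow.

```python
from scipy.stats import norm, t

z = norm.ppf(0.975)                    # standard normal 97.5% point, about 1.960
for v in (4, 9, 29, 100):
    # Upper 2.5% point of the t-distribution with v degrees of freedom.
    print(v, round(t.ppf(0.975, v), 3))
# The gap to 1.960 is already small by v = 29 (i.e., n = 30).
```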
We can then write
\[ T = \frac{(\bar{X} - \mu)/(\sigma/\sqrt{n})}{\sqrt{S^2/\sigma^2}} = \frac{Z}{\sqrt{V/(n-1)}}, \]
where
\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \]
has a standard normal distribution and
\[ V = \frac{(n-1)S^2}{\sigma^2} \]
has a chi-squared distribution with v = n − 1 degrees of freedom. In sampling from normal populations, we can show that X̄ and S² are independent, and consequently so are Z and V. The following theorem gives the definition of a random variable T as a function of Z (standard normal) and V (chi-squared). For completeness, the density function of the t-distribution is given.
Theorem 8.5: Let Z be a standard normal random variable and V a chi-squared random variable with v degrees of freedom. If Z and V are independent, then the distribution of the random variable T, where
\[ T = \frac{Z}{\sqrt{V/v}}, \]
is given by the density function
\[ h(t) = \frac{\Gamma[(v+1)/2]}{\Gamma(v/2)\sqrt{\pi v}} \left(1 + \frac{t^2}{v}\right)^{-(v+1)/2}, \qquad -\infty < t < \infty. \]
This is known as the t-distribution with v degrees of freedom.
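As a sanity check on the density in Theorem 8.5, it can be coded directly with the gamma function and compared against SciPy's `t.pdf` (a sketch assuming SciPy is available):

```python
import math
from scipy.stats import t

def h(x, v):
    """Density of the t-distribution with v degrees of freedom (Theorem 8.5)."""
    return (math.gamma((v + 1) / 2)
            / (math.gamma(v / 2) * math.sqrt(math.pi * v))
            * (1 + x**2 / v) ** (-(v + 1) / 2))

# Compare the closed form with the library density at a few points, v = 5.
max_diff = max(abs(h(x, 5) - t.pdf(x, 5)) for x in (-2.0, -0.5, 0.0, 1.5, 3.0))
print(max_diff < 1e-12)  # True: the closed form matches the library density
```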
From the foregoing and the theorem above we have the following corollary.
Corollary 8.1: Let X₁, X₂, ..., Xₙ be independent random variables that are all normal with mean μ and standard deviation σ. Let
\[ \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \quad \text{and} \quad S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2. \]
Then the random variable T = (X̄ − μ)/(S/√n) has a t-distribution with v = n − 1 degrees of freedom.
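The construction behind Corollary 8.1 can be verified numerically on a single simulated sample (standard library only; the parameter values are arbitrary): computing T directly and rebuilding it from the Z and V pieces give the same number.

```python
import math
import random
import statistics

random.seed(0)
mu, sigma, n = 3.0, 1.0, 5
xs = [random.gauss(mu, sigma) for _ in range(n)]

xbar = statistics.fmean(xs)
s = statistics.stdev(xs)

t_direct = (xbar - mu) / (s / math.sqrt(n))

z = (xbar - mu) / (sigma / math.sqrt(n))   # standard normal piece
v_stat = (n - 1) * s**2 / sigma**2         # chi-squared piece, v = n - 1
t_rebuilt = z / math.sqrt(v_stat / (n - 1))

print(abs(t_direct - t_rebuilt) < 1e-9)  # True: the two forms agree
```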
The probability distribution of T was first published in 1908 in a paper written by W. S. Gosset. At the time, Gosset was employed by an Irish brewery that prohibited publication of research by members of its staff. To circumvent this restriction, he published his work secretly under the name "Student." Consequently, the distribution of T is usually called the Student t-distribution or simply the t-distribution. In deriving the equation of this distribution, Gosset assumed that the samples were selected from a normal population. Although this would seem to be a very restrictive assumption, it can be shown that nonnormal populations possessing nearly bell-shaped distributions will still provide values of T that approximate the t-distribution very closely.
What Does the t-Distribution Look Like?
The distribution of T is similar to the distribution of Z in that they both are symmetric about a mean of zero. Both distributions are bell shaped, but the t-distribution is more variable, owing to the fact that the T-values depend on the fluctuations of two quantities, X̄ and S², whereas the Z-values depend only on the changes in X̄ from sample to sample. The distribution of T differs from that of Z in that the variance of T depends on the sample size n and is always greater than 1. Only when the sample size n → ∞ will the two distributions become the same. In Figure 8.8, we show the relationship between a standard normal distribution (v = ∞) and t-distributions with 2 and 5 degrees of freedom. The percentage points of the t-distribution are given in Table A.4.
Figure 8.8: The t-distribution curves for v = 2, 5, and ∞.

Figure 8.9: Symmetry property (about 0) of the t-distribution.
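The claim that Var(T) exceeds 1 and shrinks toward 1 can be checked numerically; for v > 2 the variance of the t-distribution is v/(v − 2), a standard fact. A sketch assuming SciPy:

```python
from scipy.stats import t

for v in (3, 5, 30, 300):
    # Variance of the t-distribution: v / (v - 2), always > 1, -> 1 as v grows.
    print(v, round(t.var(v), 4))
```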
It is customary to let t_α represent the t-value above which we find an area equal to α. Hence, the t-value with 10 degrees of freedom leaving an area of 0.025 to the right is t_{0.025} = 2.228. Since the t-distribution is symmetric about a mean of zero, we have t_{1−α} = −t_α; that is, the t-value leaving an area of 1 − α to the right and therefore an area of α to the left is equal to the negative t-value that leaves an area of α in the right tail of the distribution (see Figure 8.9). That is, t_{0.95} = −t_{0.05}, t_{0.99} = −t_{0.01}, and so forth.
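These Table A.4 lookups and the symmetry relation can be reproduced with SciPy's inverse CDF (again, area α to the right means querying the 1 − α quantile):

```python
from scipy.stats import t

v = 10
t_025 = t.ppf(1 - 0.025, v)   # t_{0.025} with 10 degrees of freedom
print(round(t_025, 3))        # 2.228, as in Table A.4

# Symmetry about zero: t_{1-alpha} = -t_alpha, e.g. t_{0.95} = -t_{0.05}.
print(abs(t.ppf(1 - 0.95, v) + t.ppf(1 - 0.05, v)) < 1e-9)  # True
```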
Example 8.8: The t-value with v = 14 degrees of freedom that leaves an area of 0.025 to the left, and therefore an area of 0.975 to the right, is
\[ t_{0.975} = -t_{0.025} = -2.145. \]
Example 8.9: Find P(−t_{0.025} < T < t_{0.05}).

Solution: Since t_{0.05} leaves an area of 0.05 to the right, and −t_{0.025} leaves an area of 0.025 to the left, we find a total area of
\[ 1 - 0.05 - 0.025 = 0.925 \]
between −t_{0.025} and t_{0.05}. Hence,
\[ P(-t_{0.025} < T < t_{0.05}) = 0.925. \]
Example 8.10: Find k such that P(k < T < −1.761) = 0.045 for a random sample of size 15 selected from a normal distribution, where T = (X̄ − μ)/(S/√n).
Figure 8.10: The t-values for Example 8.10.
Solution: From Table A.4 we note that 1.761 corresponds to t_{0.05} when v = 14. Therefore, −t_{0.05} = −1.761. Since k in the original probability statement is to the left of −t_{0.05} = −1.761, let k = −t_α. Then, from Figure 8.10, we have
\[ 0.045 = 0.05 - \alpha, \quad \text{or} \quad \alpha = 0.005. \]
Hence, from Table A.4 with v = 14,
\[ k = -t_{0.005} = -2.977 \quad \text{and} \quad P(-2.977 < T < -1.761) = 0.045. \]
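Example 8.10 can be confirmed the same way (a sketch assuming SciPy): k is the 0.005 quantile with v = 14, and the probability between k and −1.761 comes out near 0.045.

```python
from scipy.stats import t

v = 14
k = t.ppf(0.005, v)                  # -t_{0.005}, about -2.977
p = t.cdf(-1.761, v) - t.cdf(k, v)   # P(k < T < -1.761)
print(round(k, 3), round(p, 3))
```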
Exactly 95% of the values of a t-distribution with v = n − 1 degrees of freedom lie between −t_{0.025} and t_{0.025}. Of course, there are other values that contain 95% of the distribution, such as −t_{0.02} and t_{0.03}, but these values do not appear in Table A.4, and furthermore, the shortest possible interval is obtained by choosing t-values that leave exactly the same area in the two tails of our distribution. A t-value that falls below −t_{0.025} or above t_{0.025} would tend to make us believe either that a very rare event has taken place or that our assumption about μ is in error. Should this happen, we shall make the decision that our assumed value of μ is in error. In fact, a t-value falling below −t_{0.01} or above t_{0.01} would provide even stronger evidence that our assumed value of μ is quite unlikely. General procedures for testing claims concerning the value of the parameter μ will be treated in Chapter 10. A preliminary look into the foundation of these procedures is illustrated by the following example.
Example 8.11: A chemical engineer claims that the population mean yield of a certain batch process is 500 grams per milliliter of raw material. To check this claim he samples 25 batches each month. If the computed t-value falls between −t_{0.05} and t_{0.05}, he is satisfied with this claim. What conclusion should he draw from a sample that has a mean x̄ = 518 grams per milliliter and a sample standard deviation s = 40 grams? Assume the distribution of yields to be approximately normal.
Solution: From Table A.4 we find that t_{0.05} = 1.711 for 24 degrees of freedom. Therefore, the engineer can be satisfied with his claim if a sample of 25 batches yields a t-value between −1.711 and 1.711. If μ = 500, then
\[ t = \frac{518 - 500}{40/\sqrt{25}} = 2.25, \]
a value well above 1.711. The probability of obtaining a t-value, with v = 24, equal to or greater than 2.25 is approximately 0.02. If μ > 500, the value of t computed from the sample is more reasonable. Hence, the engineer is likely to conclude that the process produces a better product than he thought.
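The engineer's calculation in Example 8.11 is a two-line computation, and the tail probability quoted as "approximately 0.02" can be evaluated directly (a sketch assuming SciPy):

```python
import math
from scipy.stats import t

n, mu0, xbar, s = 25, 500, 518, 40
t_stat = (xbar - mu0) / (s / math.sqrt(n))
p_tail = t.sf(t_stat, n - 1)   # P(T >= t_stat) with v = 24 degrees of freedom

print(t_stat)                  # 2.25, well above 1.711
print(round(p_tail, 3))
```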
What Is the t-Distribution Used For?
The t-distribution is used extensively in problems that deal with inference about the population mean (as illustrated in Example 8.11) or in problems that involve comparative samples (i.e., in cases where one is trying to determine if means from two samples are significantly different). The use of the t-distribution will be extended in Chapters 9, 10, 11, and 12. The reader should note that use of the t-distribution for the statistic
\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
requires that X₁, X₂, ..., Xₙ be normal. The use of the t-distribution and the sample size consideration do not relate to the Central Limit Theorem. The use of the standard normal distribution rather than T for n ≥ 30 merely implies that S is a sufficiently good estimator of σ in this case. In chapters that follow, the t-distribution finds extensive usage.
8.7 F-Distribution
We have motivated the t-distribution in part by its application to problems in which there is comparative sampling (i.e., a comparison between two sample means). For example, some of our examples in future chapters will take a more formal approach: a chemical engineer collects data on two catalysts, a biologist collects data on two growth media, or a chemist gathers data on two methods of coating material to inhibit corrosion. While it is of interest to let sample information shed light on two population means, it is often the case that a comparison of variability is equally important, if not more so. The F-distribution finds enormous application in comparing sample variances. Applications of the F-distribution are found in problems involving two or more samples.
The statistic F is defined to be the ratio of two independent chi-squared random variables, each divided by its number of degrees of freedom. Hence, we can write
\[ F = \frac{U/v_1}{V/v_2}, \]
where U and V are independent random variables having chi-squared distributions with v₁ and v₂ degrees of freedom, respectively. We shall now state the sampling distribution of F.
Theorem 8.6: Let U and V be two independent random variables having chi-squared distributions with v₁ and v₂ degrees of freedom, respectively. Then the distribution of the random variable F = (U/v₁)/(V/v₂) is given by the density function
\[ h(f) = \begin{cases} \dfrac{\Gamma[(v_1+v_2)/2]\,(v_1/v_2)^{v_1/2}}{\Gamma(v_1/2)\,\Gamma(v_2/2)} \cdot \dfrac{f^{v_1/2-1}}{(1 + v_1 f/v_2)^{(v_1+v_2)/2}}, & f > 0, \\ 0, & f \le 0. \end{cases} \]
This is known as the F-distribution with v₁ and v₂ degrees of freedom.
We will make considerable use of the random variable F in future chapters. However, the density function will not be used and is given only for completeness. The curve of the F-distribution depends not only on the two parameters v₁ and v₂ but also on the order in which we state them. Once these two values are given, we can identify the curve. Typical F-distributions are shown in Figure 8.11.
Let f_α be the f-value above which we find an area equal to α. This is illustrated by the shaded region in Figure 8.12. Table A.6 gives values of f_α only for α = 0.05 and α = 0.01 for various combinations of the degrees of freedom v₁ and v₂. Hence, the f-value with 6 and 10 degrees of freedom, leaving an area of 0.05 to the right, is f_{0.05} = 3.22. By means of the following theorem, Table A.6 can also be used to find values of f_{0.95} and f_{0.99}. The proof is left for the reader.
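The Table A.6 lookup, and the reciprocal relation the following theorem establishes (f_{1−α}(v₁, v₂) = 1/f_α(v₂, v₁)), can both be checked with SciPy's F inverse CDF (a sketch assuming SciPy is available):

```python
from scipy.stats import f

f_05 = f.ppf(1 - 0.05, 6, 10)   # f_{0.05} with (6, 10) degrees of freedom
print(round(f_05, 2))           # 3.22, as in Table A.6

# Lower-tail value two ways: directly, and via the reciprocal relation
# f_{0.95}(6, 10) = 1 / f_{0.05}(10, 6).
f_95_direct = f.ppf(1 - 0.95, 6, 10)
f_95_recip = 1 / f.ppf(1 - 0.05, 10, 6)
print(abs(f_95_direct - f_95_recip) < 1e-6)  # True
```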