
MZB127 Topic 10 Lecture Notes (Unannotated Version)

Chapter 10 discusses confidence intervals and hypothesis testing, focusing on estimating distribution parameters like the mean and proportion from sample data. It explains how to calculate confidence intervals using the sample mean and standard deviation, as well as the sample proportion, incorporating the Central Limit Theorem and the Student's t-distribution. The chapter includes examples to illustrate the application of these concepts in real-world scenarios.


Chapter 10

Confidence Intervals and Hypothesis Testing

Preface
Often, we are interested in drawing conclusions about one or two features of a distribution, such as
the mean (µ) of a continuous variable or the probability (p) of a specified outcome for a discrete
variable. These parameters can be estimated using the corresponding sample estimates (X̄ or p̂),
but these estimates won't be exactly correct, so a natural question we raised in Section 9.2.1
was:

How close might our sample statistic (data-based estimate) be to the “true” value
of the relevant distribution parameter?

We will finalise our discussion of this question in the present chapter by formalising the concept
of confidence intervals (Section 10.1).
A second question we raised in Section 9.2.1 was:

Is it reasonable to assume a particular feature in the distribution of a variable? (e.g.,
to assume a particular value for a distribution mean or a theoretical probability?)

In this chapter we will also address this second question via the concept of hypothesis testing
(Section 10.2).
For both questions, we must use the distribution of the estimates for the parameters we
are interested in, that is, the sampling distributions for either X̄ or p̂. However, unlike the
calculations we considered in Sections 9.2.2 and 9.2.3, we may have no information available to
us about the true parameters µ, σ and/or p, so our approaches must change accordingly.


10.1 Confidence Intervals

10.1.1 Estimating the true mean µ from the sample mean x̄, sample standard deviation s and sample size n

Consider a Normal variable X ∼ N(µ, σ²), but now we don't know what µ or σ is.
Recall that, if the Central Limit Theorem holds or if X is already Normal, the mean X̄ of an
n-sized sample of X follows a Normal distribution X̄ ∼ N(µ, σ²/n), and thus any obtained
sample mean x̄ can be converted to a sample from the standard Normal distribution Z ∼ N(0, 1)
via

z = (x̄ − µX̄)/σX̄ = (x̄ − µ)/(σ/√n).

If we have a sample mean x̄ from a sample of size n, we should also be able to calculate a sample
standard deviation s. If we don't know what the true standard deviation σ is, it can be shown
(not here!) that we can actually replace σ by s in the above equation if we use the Student's
t-distribution with n − 1 degrees of freedom instead of the standard Normal distribution:

t = (x̄ − µ)/(s/√n), d = n − 1.

The above equation is a key application of the Student’s t-distribution (that we introduced
last week); this equation will also allow us to calculate a confidence interval for the true mean
µ, as we will see shortly. A confidence interval is a range of plausible values for the quantity
that we are trying to estimate. When talking about confidence intervals, we define a confidence
level (e.g. 90%, 95%, 99%) as the estimated probability that the true value of the quantity falls
within the range of plausible values that we specify. For example, we may wish to use our data
(x̄, s and n) to estimate, with 95% confidence, a range of plausible values for µ.
We usually use (1 − α) to denote the confidence level. For example, if α = 0.01, we are interested
in a 99% confidence interval; if α = 0.05, we are interested in a 95% confidence interval; etc. If
we use a to denote the maximum difference between x̄ and µ that defines the lower and upper
bounds of our confidence interval, we can therefore write:

Pr(−a < x̄ − µ < a) = 1 − α.

Thus, if we know a, then we know that there is a probability of 1 − α that the difference
between the sample mean and true mean, x̄ − µ, is between −a and a. Equivalently, we can say
that the 100(1 − α)% confidence interval for µ is between x̄ − a and x̄ + a, or more simply, this
confidence interval is x̄ ± a. We must now try to work out what a is!
To do this, we assume that it is equally likely (or unlikely!) that x̄ − µ could exceed a versus
the possibility that x̄ − µ could be less than −a. This assumption, combined with the above
equation, gives us that:

Pr(x̄ − µ > a) = Pr(x̄ − µ < −a) = α/2.

If we then focus on the first probability expression in the equation above, and divide the
inequality inside the brackets of Pr(x̄ − µ > a) by s/√n, we obtain:

Pr((x̄ − µ)/(s/√n) > a/(s/√n)) = α/2.

We note that the term on the left-hand side of the inequality is described by a t-distribution
with degrees of freedom d = n − 1. Noting this, and with a slight abuse of notation, we end up
with:

Pr(T > a/(s/√n)) = α/2, d = n − 1.

If we now denote td,p as the value of the t-distribution for d degrees of freedom that satisfies
Pr(T > td,p ) = p, the above equation gives us that:

tn−1,α/2 = a/(s/√n), or equivalently, a = tn−1,α/2 × s/√n.

To obtain tn−1,α/2 , we can use Fawcett and Kent Table 6, or in Excel we can use the command
=T.INV(1-α/2,n-1) (see Section 9.1.2 for further details). The next few examples demonstrate
how to apply the equations above to calculate confidence intervals for the true mean µ, from a
sample mean x and sample standard deviation s obtained from a sample of size n.
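These calculations are also easy to script. Below is a minimal Python sketch (the helper name `mean_ci` is our own); the tabulated value tn−1,α/2 must still be supplied, e.g. from Fawcett and Kent Table 6 or Excel's =T.INV:

```python
import math

def mean_ci(xbar, s, n, t_crit):
    """Confidence interval for the true mean: x̄ ± a, where a = t_{n-1, α/2} × s/√n.

    t_crit is the tabulated value t_{n-1, α/2}, e.g. from Fawcett and
    Kent Table 6 or Excel's =T.INV(1-α/2, n-1).
    """
    a = t_crit * s / math.sqrt(n)
    return xbar - a, xbar + a

# Example 10.1.1.3(b) data: x̄ = 5.3, s = 0.64, n = 20; for a 98% CI,
# α/2 = 0.01 and t_{19,0.01} = 2.539 (Table 6)
lo, hi = mean_ci(5.3, 0.64, 20, 2.539)   # ≈ (4.94, 5.66)
```

The same helper reproduces any of the mean-based intervals in the examples below once the appropriate t-value has been looked up.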

Examples
10.1.1.1 Live example: output voltage of a computer power supply revisited

In Example 9.2.2.1, we considered the output voltage of a computer power supply described
by the variable X with X ∼ N (6.5, 0.022 ), that is, µ = 6.5 and σ = 0.02.
Now, suppose that the true mean µ and true standard deviation σ are both unknown, but
a random sample of 25 observations gave a sample mean of x = 6.505 V and a sample
standard deviation s = 0.022 V. Find the value a such that Pr(−a < x̄ − µ < a) = 0.95.
Can you also explain what this value of a tells us?

10.1.1.2 Live Example: MZB127 students

A sample of MZB127 students are randomly selected and have their heights measured.
The measurements that were obtained are tabulated below.

Height (cm) 171 184 177 178 165 190 179 183

Compute a 95% confidence interval for the average height of an MZB127 student.

10.1.1.3 Live Example: Rainwater acidity

A group of environmental chemists is interested in the acidity of rainwater, which is
measured by a quantity known as the pH value of the water. A sample of 20 rainfalls in a
particular area gave a sample mean pH value of 5.3 and a sample standard deviation of
0.64.

(a) Use this information to provide a 99% confidence interval estimate for the true mean
µ of rainwater pH values in this area.

(b) Provide a 98% confidence interval estimate for the true mean µ of rainwater pH
values in this area.
Denoting the pH value of the rainwater samples as x, we have x̄ = 5.3, s = 0.64 and
n = 20. For a confidence level of 1 − α = 0.98, we have α/2 = 0.01, so we look up
t19,0.01 on Fawcett and Kent Table 6 and find that t19,0.01 = 2.539. The desired 98%
confidence interval for µ is then given by
x̄ ± a = x̄ ± tn−1,α/2 × s/√n = 5.3 ± 2.539 × 0.64/√20 ≈ 5.3 ± 0.36 = 4.94 to 5.66.

The first question we raised in Section 9.2.1 and reiterated at the start of the chapter, "How
close might our sample statistic (data-based estimate) be to the "true" value of the relevant
distribution parameter?", has now been answered for the case where the sample statistic is the
sample mean x̄! We next tackle the case where our sample statistic is the sample proportion p̂.

10.1.2 Estimating the true proportion p from the sample proportion p̂ and sample size n

Consider a binomial variable X ∼ bin(n, p), but now we don't know what p is.
Recall that, if the Central Limit Theorem holds, the binomial distribution X ∼ bin(n, p) can
be approximated by a Normal distribution for p̂ = X/n via p̂ ∼ N(µp̂, σp̂²), where µp̂ = p and
σp̂ = √(p(1 − p)/n). We could therefore convert values p̂ from its Normal distribution to values
z from a standard Normal distribution Z ∼ N(0, 1) via the formula:

z = (p̂ − µp̂)/σp̂ = (p̂ − p)/√(p(1 − p)/n).

Note that the conversion above is straightforward if we know the true proportion p, but rather
complicated if we don't know p! The trick we employ here is to approximate p by p̂ in the
denominator,

z ≈ (p̂ − p)/√(p̂(1 − p̂)/n),

as this makes later computations more straightforward. (Note that this means we have now
introduced two approximations: (1) approximating the binomial distribution by a Normal
distribution, and (2) approximating p by p̂. Alas, this is sometimes required in mathematics or
statistics to obtain an answer without getting stuck!)
If we only have a sample estimate, i.e. we only know values p̂ and n, a confidence interval for
the true proportion p can be written as

Pr(−a < p̂ − p < a) = 1 − α,

where, like in Section 10.1.1, (1 − α) is the confidence level, and a is the value that denotes the
maximum difference between p̂ and p to define the lower and upper bounds of our confidence
interval. We now must find a suitable value for a.
Similarly to Section 10.1.1, we assume it is equally likely (or unlikely!) that p̂ − p could exceed
a versus the possibility that p̂ − p could be less than −a. This assumption, combined with the
above equation, gives us that:
Pr(p̂ − p > a) = Pr(p̂ − p < −a) = α/2.

If we then focus on the first probability expression in the equation above, and divide the
inequality inside the brackets of Pr(p̂ − p > a) by √(p̂(1 − p̂)/n), we obtain:

Pr((p̂ − p)/√(p̂(1 − p̂)/n) > a/√(p̂(1 − p̂)/n)) = α/2.

We note that the term on the left-hand side of the inequality is described by a standard Normal
distribution. Noting this, and with a slight abuse of notation, we end up with:

Pr(Z > a/√(p̂(1 − p̂)/n)) = α/2.

If we now denote zp as the value of the standard Normal distribution that satisfies Pr(Z > zp ) = p,
the above equation gives us that:

zα/2 = a/√(p̂(1 − p̂)/n), or equivalently, a = zα/2 × √(p̂(1 − p̂)/n).

To obtain zα/2 , we can use Fawcett and Kent Table 4, or in Excel we can use the command
=NORM.S.INV(1-α/2) (see Section 9.1.1 for further details). The next few examples demonstrate
how to apply the equations above to calculate confidence intervals for the true proportion p,
from a sample proportion p̂ obtained from a sample of size n.
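As a sketch in Python (standard library only; the helper name `prop_ci` is our own), where `NormalDist().inv_cdf` plays the role of Table 4 or =NORM.S.INV:

```python
import math
from statistics import NormalDist

def prop_ci(x, n, confidence=0.95):
    """Confidence interval for the true proportion: p̂ ± a, a = z_{α/2} × √(p̂(1 − p̂)/n)."""
    phat = x / n
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # z_{α/2}
    a = z_crit * math.sqrt(phat * (1 - phat) / n)
    return phat - a, phat + a

# Example 10.1.2.2(c) data: 625 cyclists among 945 bike-track users, 95% confidence
lo, hi = prop_ci(625, 945)   # ≈ (0.63, 0.69)
```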

Examples
10.1.2.1 Live example: fair 12-sided die revisited

In Example 9.2.3.1, we considered a game involving a 12-sided die with faces numbered
from 1 to 12.
Suppose that we no longer assume the die is fair but instead are trying to estimate the
true proportion p of 10s for this die using a sample of 150 throws and that our estimate
from this sample is p̂ = 0.08. Find a value a such that Pr(−a < p̂ − p < a) ≈ 0.9. Can
you also explain what this value of a tells us?

10.1.2.2 Live example: Cyclists on a bike track

Consider that 945 people were observed on a bike track, including 625 cyclists.

(a) Estimate the true proportion p of cyclists among the users of this bike track.

(b) Determine the value a such that we can have 90% confidence that a sample proportion
from a sample of this size will lie within a of the true proportion p.

(c) What is the 95% confidence interval for the true proportion p?
We have X = 625, n = 945 and hence p̂ = 625/945 ≈ 0.6614. The confidence level is
1 − α = 0.95, so zα/2 = z0.025 = 1.96 (from Fawcett and Kent Table 4). We can then
calculate a as

a = zα/2 × √(p̂(1 − p̂)/n) = 1.96 × √(0.6614 × 0.3386/945) ≈ 0.03017.

Thus the 95% confidence interval for the true proportion of cyclists is (approximately)
p = 0.66 ± 0.03, or between 0.63 and 0.69.

10.1.2.3 Live Example: Proportion of defective batteries

Engineers at a battery plant conducted a series of experiments to investigate the proportions
of cells (batteries) that were being scrapped due to internal short circuits. At one
stage in the investigation, a sample of 235 cells produced 9 cells with short circuits.

(a) Use this information to provide a 95% confidence interval for the true proportion p
of defective batteries for this plant.

(b) Estimate the true proportion of defective batteries under these production conditions
using a 90% confidence interval.
In this case, we have X = 9, n = 235 and so p̂ = 9/235 ≈ 0.0383. For a confidence
level of 1 − α = 0.90, we have α/2 = 0.05, so we look up z0.05 on Fawcett and Kent
Table 4 and find that z0.05 = 1.64485. The desired 90% confidence interval for p is
then given by
p̂ ± a = p̂ ± zα/2 × √(p̂(1 − p̂)/n) = 0.0383 ± 1.64485 × √(0.0383 × 0.9617/235)
≈ 0.0383 ± 0.0206 = 0.0177 to 0.0589.

So a 90% interval estimate (confidence interval) for p based on this sample is
(0.0177, 0.0589). (Note that this result may be slightly approximate, since we only
have X = 9, which is rather close to our guideline of "p not too extreme" required for
the Central Limit Theorem.)

The first question we raised in Section 9.2.1 and reiterated at the start of the chapter, “How
close might our sample statistic (data-based estimate) be to the “true” value of the relevant
distribution parameter?”, has now been answered for the case where the sample statistic
(data-based estimate) is the sample proportion p̂.

10.2 Hypothesis Tests


We now consider the second of our questions: whether it is reasonable to assume a particular
feature (e.g., a particular value for a mean µ or probability p) in the distribution of a variable.
In other words, we wish to assess whether the data we have are consistent with the assumed
feature. This question is answered by using a formal statistical test (also known as a hypothesis
test). Figure 10.1 below shows one way of summarising the process we use in hypothesis testing.

Figure 10.1: Diagram of the hypothesis testing process.

The essential concept in all of this is that we compare the assumption against the sample
evidence by calculating a probability of the form

Pr(observing the sample we observed assuming that the hypothesis (assumption) is true).

The smaller this probability, the less we are inclined to accept the assumption as a reasonable
description (model) for the observations in our sample. So a small probability provides evidence
against the assumption, whereas a “not small” probability doesn’t. It is also worth noting that

a statistical test cannot prove that an assumption is false (nor that it is true); it can only tell
you how strong (or weak) the evidence of the sample is for rejecting the assumption.
The assumption that is being tested is usually known as the null hypothesis, denoted as H0 for
short; this is the assumption that will be accepted (for the moment) unless the evidence from
the sample is strong enough to reject it. In a formal statistical test, this assumption should be
stated at the start of the test. The probability that is calculated (based on the assumption and
the sample) is usually known as the p-value (short for probability). Clearly, the assumption that
is to be tested will vary from one situation to another. The type of assumption to be tested will
also vary between situations, and this will change the way we calculate the p-value from the
data and the assumption. However, the process for drawing the conclusion (using the p-value to
assess the strength of evidence against the assumption) is the same for all formal statistical
tests.

10.2.1 Interpreting p-values


We mentioned above that smaller p-values indicate stronger evidence against the assumption
being tested. Evidence against an assumption is usually not considered strong enough to reject
that assumption unless the p-value is around 5% (0.05) or less. This second comment is merely
an observation which formalises people’s general intuition in this area; there is no hard and fast
rule about how low a p-value can or should be before an assumption is rejected. In practice,
the level of evidence required for such a decision will often depend on the consequences of
the decision: before making a decision that results in high costs (e.g., retooling a factory or
instituting some remedial program), the decision makers will usually want very strong evidence
that this is required (perhaps a p-value below 1%, or even lower), while in other cases a moderate
indication (perhaps a p-value of 5%) would be appropriate. For these reasons, it is usually best
to quote the actual p-value of a test and interpret it in general terms such as the following:

• a p-value over 10% doesn't provide enough evidence to seriously doubt H0

• a p-value between 5% and 10% provides slight evidence against H0

• a p-value around 5% (say, 4-6%) provides some (moderate) evidence against H0

• a p-value around 2% (say, 1-3%) provides (quite) strong evidence against H0

• a p-value below 1% provides very strong evidence against H0
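For quick reporting, this verbal scale can be encoded as a small helper. This is a sketch only: the exact cut-offs below are our reading of the (deliberately overlapping) ranges above, not hard rules.

```python
def evidence_against_H0(p_value):
    """Map a p-value to the verbal scale of Section 10.2.1 (guideline only)."""
    if p_value > 0.10:
        return "not enough evidence to seriously doubt H0"
    if p_value > 0.06:
        return "slight evidence against H0"
    if p_value > 0.03:
        return "some (moderate) evidence against H0"
    if p_value > 0.01:
        return "(quite) strong evidence against H0"
    return "very strong evidence against H0"
```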

In some fields of study, fixed values (typically 5% or 1%) are routinely used as “cut-offs” for
deciding whether to reject an assumption and the result is just reported as “H0 rejected” or
“H0 accepted” (interpreted into the relevant context). This is not an ideal practice, as it makes
an artificial distinction between similar p-values such as 0.049 and 0.051 and gives no indication
of how strong (or weak) the evidence against H0 actually is. The best approach, as indicated
above, is to quote the actual p-value and to interpret the corresponding strength of evidence
against H0 , stating the conclusion to be drawn from this in the original context for the test.

10.2.2 Test of a proposed mean µ


The broad concepts of statistical testing can be applied to testing assumptions about the
theoretical mean of a numerical variable (usually a continuous variable). This can be useful
in virtually any context where numerical information is being collected and compared against

an accepted standard: for example, quality control of component dimensions; monitoring of
voltages, contaminant levels, lighting levels, dispensed volumes, etc.; assessing relevant properties
of newly developed products or processes against existing standards; and so on.
In Section 10.1.1, we noted that

t = (x̄ − µ)/(s/√n), d = n − 1.

This allows us to make probability statements about the sample mean x̄, assuming that µ is
known. If we have an assumption (H0 ) about µ, we can therefore use an appropriately large
sample of data together with the formula above to obtain evidence regarding the assumption;
that is, to give us a p-value for testing the assumption.
Note: These tests rely on the validity of the sampling distribution for X̄ described by the
equation above, which requires in turn that the variable is Normally distributed or that n ≥ 30.
This should be checked (at least roughly) before proceeding with such a test.
It is also useful to specify the assumption that will be accepted if H0 is rejected. This is known
as the alternative hypothesis and is usually denoted as either H1 (or, sometimes, HA ). The
alternative hypothesis H1 tells us how to find the p-value. For example:

• In testing whether a pollution threshold k has been exceeded, we would test H0 : µ = k versus H1 : µ > k.

• In testing whether there is bias in a measured quantity (with true value M), we test H0 : µ = M vs H1 : µ ≠ M.

• In testing whether a coin is biased, we would test H0 : p = 0.5 against H1 : p ≠ 0.5.

• In testing whether process changes have reduced the proportion of defectives below the previous rate d, we test H0 : p = d against H1 : p < d.

When H1 includes < or >, it’s called a one-sided alternative and the test is often known as a
one-sided test. Likewise, when H1 includes ̸=, we have a two-sided alternative and a two-sided
test. The choice of alternative hypothesis only affects how we find the p-value from the table, as
follows:

• For a one-sided test, our p-value is either Pr(T < t) or Pr(T > t), so the p-value is simply equal to the relevant one of these two probabilities.

• For a two-sided test, our p-value will be Pr(T < −|t|) + Pr(T > |t|) = 2 × Pr(T > |t|), so the p-value is double the value of Pr(T > |t|).
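The test statistic itself is straightforward to compute; the p-value then comes from the t-tables as described above. A minimal sketch (the helper `t_statistic` is our own), using the rainwater data of Example 10.1.1.3 with H0 : µ = 5.7:

```python
import math

def t_statistic(xbar, mu0, s, n):
    """t = (x̄ − µ0)/(s/√n), to be looked up with d = n − 1 degrees of freedom."""
    return (xbar - mu0) / (s / math.sqrt(n))

# Rainwater data: x̄ = 5.3, s = 0.64, n = 20, testing H0: µ = 5.7
t = t_statistic(5.3, 5.7, 0.64, 20)   # ≈ -2.80; look up |t| with d = 19
```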

Examples
10.2.2.1 Live Example: MZB127 students

A sample of MZB127 students are randomly selected and have their heights measured.
The measurements that were obtained are tabulated below.

Height (cm) 171 184 177 178 165 190 179 183

(a) Is there evidence to reject the hypothesis/assumption that the average height of an
MZB127 student is 184 cm?

(b) Is there evidence to suggest that the average height is less than 184 cm?

10.2.2.2 Rainwater acidity revisited

In Example 10.1.1.3, a group of environmental chemists investigated the acidity of
rainwater, as measured by its pH value (where lower pH values indicate greater acidity).
If pure rain falling through clean air registers a pH value of 5.7, does the sample they
collected (20 observations, with a sample mean pH value of 5.3 and a sample standard
deviation of 0.64) provide evidence that the rain in that area is more acidic than it should
ideally be?
Denote the pH value of rainfalls in this area as X and its distribution mean as µ, and
assume that X is Normally distributed. We wish to test H0 : µ = 5.7 vs H1 : µ < 5.7.
Our sample gave us x̄ = 5.3 and we want to find how likely we were to observe a value
this low if H0 is correct; that is, we want to find Pr(X̄ ≤ 5.3) given that µ = 5.7.
Our sample gives n = 20 and s = 0.64, so

Pr(X̄ ≤ 5.3) = Pr(T ≤ (5.3 − 5.7)/(0.64/√20))
            = Pr(T ≤ −2.80)
            = Pr(T ≥ 2.80)   [by symmetry of the t-distribution]

We can now look up the value 2.80 on the t-distribution with n − 1 = 19 degrees of
freedom and we find that Pr(T ≥ 2.80) ≈ 0.00571. (The table we use is Table 5 in the
tables that have been posted on Canvas.) As this is a one-sided test, Pr(T > 2.80) = 0.00571
is the p-value for the test.
This is a very small probability, so there is very strong evidence against H0; that is, we
have very strong evidence to conclude that the rain in this area is more acidic than the
standard.

10.2.2.3 Latent heat of fusion

[from “Experimental Statistics” (NBS Handbook 91) by M. G. Natrella]


Measurements of the latent heat of fusion for ice were made using two different methods.
The first method was used on 13 specimens, yielding a sample average of 80.021 cal/g
and a sample standard deviation of 0.024 cal/g. Given that the quoted value for this
measurement is 80 cal/g, does this sample provide evidence of a bias in this measurement
method?
Denote the latent heat measurements from this method as X. As in the previous example,
we need to assume that these measurements are Normally distributed. Now if the method
is unbiased, the true mean of these measurements should equal the quoted value of
80 cal/g. We are therefore being asked to test H0 : µ = 80 vs H1 : µ ̸= 80.
From the sample, we have x̄ = 80.021, s = 0.024 and n = 13. Evaluating the test statistic,
we have

t = (x̄ − µ)/(s/√n) = (80.021 − 80)/(0.024/√13) ≈ 3.15.

We look up this test statistic on the t-distribution tables, using n − 1 = 12 degrees of
freedom, and find that Pr(T > 3.15) is less than 0.00459. Since this is a two-sided test,
we double Pr(T > 3.15) to find the p-value, so we have a p-value that is < 2 × 0.00459 =
0.00918.
This is a low probability, so there is strong evidence to reject H0 and to conclude instead
that there is indeed a bias in this measurement method.

10.2.3 Test of a proposed proportion p


There are a number of situations in practice where we might be interested in testing whether a
sample from a binomial situation is consistent with an assumed value for the true proportion
(probability) of “successes”. Some simple examples include quality control situations (Has the
proportion of defective items from the production line increased? Is it within acceptable limits?)
and opinion polls (Does a majority (over 50%) of the population support a particular party or
policy? Is the party or policy supported by more than (say) 65% of the population?)
With today’s computing power, tests of this sort can easily be carried out using the exact
binomial distribution. For calculations by hand, however, the sample size in most of these
situations is far too large for us to do this easily, so we use an approximate distribution to find
the p-value. As discussed previously, provided that n is reasonably large (n ≥ 30) and p is not
too extreme (X ≈ np ≥ 5 and n − X ≈ n(1 − p) ≥ 5), the Normal distribution provides a good
approximation to the binomial distribution:

• if X ∼ bin(n, p), then (approximately) X ∼ N(np, np(1 − p));

• and if p̂ = X/n, then (approximately) p̂ ∼ N(p, p(1 − p)/n).
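The remark above about exact calculation is easy to demonstrate: with `math.comb`, the exact binomial tail probability can be summed directly and compared against the Normal approximation. A sketch (helper name is our own), using the battery data X = 9, n = 235 under H0 : p = 0.06:

```python
import math
from statistics import NormalDist

def binom_cdf(x, n, p):
    """Exact Pr(X ≤ x) for X ~ bin(n, p), summed term by term."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))

# Battery data under H0: p = 0.06 with n = 235: exact tail vs Normal approximation
exact = binom_cdf(9, 235, 0.06)                          # Pr(X ≤ 9), exact
z = (9 / 235 - 0.06) / math.sqrt(0.06 * 0.94 / 235)
approx = NormalDist().cdf(z)                             # Normal approximation
```

The two values differ somewhat (the Normal approximation here carries no continuity correction), which is one reason the guideline np ≥ 5 and n(1 − p) ≥ 5 matters.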

From Section 10.1.2, we noted that values of p̂ from its approximate Normal distribution can be
converted to values z from the standard Normal distribution via:

z = (p̂ − p)/√(p(1 − p)/n).
We can then use the corresponding Normal probabilities associated with this z-value to carry
out a hypothesis test on a proposed proportion p.
Note that H0 gives us a value of p for this calculation, so there is no need for any further
approximation. Once again, H1 can be either one-sided or two-sided, with

• For a one-sided test, the p-value equals Pr(Z < z) or Pr(Z > z), so the p-value is equal to the relevant one of these two probabilities.

• For a two-sided test, the p-value equals Pr(Z < −|z|) + Pr(Z > |z|) = 2 × Pr(Z > |z|), so the p-value is double the value of Pr(Z > |z|).
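These two cases can be checked numerically with Python's standard library (`statistics.NormalDist` supplies the standard Normal CDF; the numbers below are the fair-coin data of Example 10.2.3.3):

```python
import math
from statistics import NormalDist

# Fair-coin data: 110 heads in 200 throws, H0: p = 0.5
phat = 110 / 200
z = (phat - 0.5) / math.sqrt(0.5 * 0.5 / 200)   # ≈ 1.41
tail = 1 - NormalDist().cdf(abs(z))             # Pr(Z > |z|)
p_two_sided = 2 * tail                          # for H1: p ≠ 0.5
p_one_sided = tail                              # would apply for H1: p > 0.5
```

With z rounded to 1.41 for the tables, this reproduces the 2 × 0.07927 ≈ 0.1585 found in Example 10.2.3.3.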

Examples
10.2.3.1 Live Example: Bike track users revisited

An investigation into the usage of a bike track observed 945 people using the track, of
whom 625 were cyclists. Historical data indicates that the proportion of users on this
track who are cyclists is 65%.

(a) Is there any evidence from this sample to suggest that this proportion has changed
since then?

(b) Is there any evidence from this sample to suggest that this proportion has increased?

10.2.3.2 Proportion of defective batteries revisited

In Example 10.1.2.3, electrical cells (batteries) from a battery plant were tested and we
considered the proportion of defective cells being produced under the current operating
conditions. Suppose that the company had previously quoted the proportion of defective
cells at 6%. Having put procedures in place to improve their manufacturing process, they
want to know whether the data from the latest sample (9 defective cells in a sample of
235) provide evidence of a decrease in the proportion of defective cells.
If we denote the proportion of defective cells as p, we are testing H0 : p = 0.06. Since we are
interested in whether there is evidence of a decrease in p, our alternative is H1 : p < 0.06.
Our sample gives a sample proportion of p̂ = 9/235 ≈ 0.0383. The probability of observing
a sample at least this unusual (if H0 is true) can be calculated approximately as:
Pr(p̂ ≤ 0.0383) = Pr(Z ≤ (0.0383 − 0.06)/√(0.06 × 0.94/235))
              = Pr(Z ≤ −1.40)
              = Pr(Z > 1.40)
              ≈ 0.0808   [from Table 3 of the probability tables]

This is the p-value for our test, so there is only very slight evidence against H0 ; that is,
there is only slight evidence that the true proportion of defectives has decreased.

10.2.3.3 Fair coin

A fair coin is assumed to have an equal chance of producing either a head or a tail.
Suppose we throw a fair coin 200 times and observe 110 heads (and hence 90 tails). Does
this sample provide evidence that the coin is biased?
Let p be the proportion of heads from tossing this coin. Since we would be equally
concerned by a change in either direction, we are testing H0 : p = 0.5 vs H1 : p ≠ 0.5 in
this case.
Our sample gives a sample proportion of p̂ = 110/200 = 0.55. The test statistic for this
test is
z = (p̂ − p)/√(p(1 − p)/n) = (0.55 − 0.5)/√(0.5 × 0.5/200) ≈ 1.41.

Now, since our alternative hypothesis here is that p ̸= 0.5, we are actually looking to
see whether p̂ is too different from 0.5 on either side, in which case 90 heads (or fewer)
counts as being just as unusual as 110 heads (or more). It follows therefore that our
p-value in this situation (namely, Pr(observing a sample at least as unusual as this one
assuming H0 is true)) is not just Pr(Z > 1.41) but rather Pr(Z < −1.41)+Pr(Z > 1.41) =
2 × Pr(Z > 1.41).
From Table 3, Pr(Z > 1.41) ≈ 0.07927, so our p-value is therefore ≈ 2 × 0.07927 = 0.15854.
This is not a low p-value, so there is not enough evidence to reject H0 and we conclude
that we do not have evidence (from this sample) that the coin is biased.

10.2.4 Hypothesis testing summary


To conduct a hypothesis test (for an assumption about µ or p):

1. State the hypotheses H0 and H1 .

2. Obtain the relevant sample statistics (x̄, s, n or p̂, n).


3. Evaluate the test statistic t = (x̄ − µ)/(s/√n) or z = (p̂ − p)/√(p(1 − p)/n).

4. Look up the calculated value of the test statistic on the relevant table (t-distribution with
d = n − 1 degrees of freedom, or standard Normal distribution) to find the p-value (paying
attention also to H1 ).

5. State the strength of evidence (if any) against H0 , based on the p-value.

6. Interpret this result into the context of the statistical investigation.
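For the proportion case, the steps above can be collected into one sketch function (the name and the `alternative` argument are our own; the t case is identical in shape but needs t-tables or a statistics package for step 4):

```python
import math
from statistics import NormalDist

def proportion_test(x, n, p0, alternative="two-sided"):
    """Steps 2-4 of the summary for H0: p = p0 (Normal approximation).

    alternative reflects H1: "less", "greater" or "two-sided".
    Returns the test statistic z and the p-value.
    """
    phat = x / n                                     # step 2: sample statistic
    z = (phat - p0) / math.sqrt(p0 * (1 - p0) / n)   # step 3: test statistic
    Z = NormalDist()
    if alternative == "less":                        # step 4: p-value follows H1
        p_value = Z.cdf(z)
    elif alternative == "greater":
        p_value = 1 - Z.cdf(z)
    else:
        p_value = 2 * (1 - Z.cdf(abs(z)))
    return z, p_value

# Example 10.2.3.2: 9 defectives in 235, H0: p = 0.06 vs H1: p < 0.06
z, p = proportion_test(9, 235, 0.06, alternative="less")
```

Steps 5 and 6 remain interpretive: with p ≈ 0.08 here, there is only slight evidence against H0, as in the worked example.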
