ES12010 Lecture 8 2023-24
Lecture 8
Testing of Hypothesis
• Dr Imran Shah
• Lecturer in Economics
• Email: i.h.shah@bath.ac.uk
• Office: 3 East 4.27
Outline and ILOs (1-2)
Reference: Newbold Chapter 9
• Define Type I and Type II errors and assess the power of a test.
• Formulate null and alternative hypotheses for applications involving
• Complete a hypothesis test for the difference between two proportions (large
samples).
Rules Learned Thus Far
(H0 : μ =3).
– ( H 1: μ ≠ 3 )
Level of Significance :
• Defines the unlikely values of the sample statistic if the null hypothesis is true.
Type I and Type II errors are the two kinds of mistake that can be made in a hypothesis test.
• Type I Error
• Reject a true null hypothesis.
• Considered a serious type of error.
• The probability of Type I Error is α
• Called level of significance of the test.
• Set by researcher in advance.
• Type II Error
• Fail to reject a false null hypothesis.
• The probability of Type II Error is β.
Errors in Making Decisions (2-3)
1. Type I Error (False Positive):
• Example: In a medical test for a disease, a Type I error would occur if the test
incorrectly shows that a person has the disease when they actually do not.
• Result: The individual may undergo unnecessary treatment due to the false
diagnosis.
• The power of a test is the probability of rejecting a null hypothesis that is false, i.e.
it equals 1- β.
• The power of the test increases as the sample size increases [ the more observations
you have the easier it is to make a correct decision].
• For example, if the probability of finding a guilty person innocent is β = 0.15, the power
of the test is 1 - 0.15 = 0.85: we find them guilty 85% of the time.
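The link between power and sample size can be sketched numerically. The snippet below is a minimal illustration for an upper-tail z test with known σ; the function name `upper_tail_power` and all the numbers (μ0 = 52, true mean 54, σ = 10, α = 0.05) are hypothetical values chosen for illustration, not taken from the lecture.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_power(mu0, mu1, sigma, n, alpha):
    """Power of the upper-tail z test of H0: mu <= mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)   # critical value z_alpha
    shift = (mu1 - mu0) / (sigma / sqrt(n))   # distance of mu1 from mu0 in standard errors
    beta = std_normal.cdf(z_alpha - shift)    # P(fail to reject H0 | mu = mu1)
    return 1 - beta                           # power = 1 - beta


# Hypothetical values, for illustration only.
for n in (16, 64, 256):
    power = upper_tail_power(mu0=52, mu1=54, sigma=10, n=n, alpha=0.05)
    print(f"n = {n:3d}: power = {power:.3f}")
```

With everything else held fixed, the printed power rises as n grows, which is the point made in the bullet above.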
Decision Rules
• H0 : μ ≤ μ0 ; H1: μ > μ0. ASSUME we know the population variance (σ²).
• That is you reject the null if the sample mean is more standard deviations
away from the proposed population mean than is viewed as reasonable.
The P Value
• A critical concept in this is the p value.
• p-value: probability of obtaining a test statistic more extreme ( ≤ or ≥ ) than the
observed sample value, given H0 is true.
Tests of the Mean of a Normal Distribution Sigma Known
• Convert sample result ( $\bar{x}$ ) to a z value
Decision Rule
Reject H0 if $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} > z_\alpha$
Alternate rule:
Reject H0 if $\bar{x} > \mu_0 + z_\alpha \frac{\sigma}{\sqrt{n}}$
P-Value Approach to Testing
• p-value: Probability of obtaining a test statistic more extreme than the observed
sample value given H0 is true.
• Also called observed level of significance.
• Smallest value of α for which H0 can be rejected.
p-Value Approach to Testing
• Convert sample result (e.g. $\bar{x}$) to test statistic (e.g., z statistic).
• Obtain the p-value.
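• For an upper tail test:
$\text{p-value} = P\left(z > \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}, \text{ given that } H_0 \text{ is true}\right) = P\left(z > \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \,\middle|\, \mu = \mu_0\right)$
• Decision rule: compare the p-value to α.
• If p-value < α, reject H0.
• If p-value ≥ α, do not reject H0.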
Example 1: Upper-Tail Z Test for Mean Sigma Known (1-3)
• A phone industry manager thinks that customers' monthly cell phone bills have
increased, and now average over $52 per month. The company wishes to test this
claim. (Assume σ = 10 is known)
• i.e.: there is not sufficient evidence that the mean bill is over $52
Example 1: Upper-Tail Z Test for Mean Sigma Known: P-value Solution (3-3)
$\text{p-value} = P(\bar{x} \ge 53.1 \mid \mu = 52.0) = P\left(z \ge \frac{53.1 - 52.0}{10/\sqrt{64}}\right) = P(z \ge 0.88) = 1 - 0.8106 = 0.1894$
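The p-value calculation for Example 1 can be reproduced with a short sketch. The function name `upper_tail_z_test` is a hypothetical helper; the inputs (sample mean 53.1, μ0 = 52.0, σ = 10, n = 64) are the figures used in the solution above.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(x_bar, mu0, sigma, n):
    """Return (z, p-value) for H0: mu <= mu0 against H1: mu > mu0, sigma known."""
    z = (x_bar - mu0) / (sigma / sqrt(n))   # standardise the sample mean
    p_value = 1 - NormalDist().cdf(z)       # upper-tail area beyond z
    return z, p_value


z, p_value = upper_tail_z_test(x_bar=53.1, mu0=52.0, sigma=10, n=64)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A p-value of roughly 0.19 is larger than any conventional α, which agrees with the conclusion that there is not sufficient evidence that the mean bill is over $52.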
H0: μ = 3 , H 1: μ ≠ 3
• Now we will reject if the sample mean is either too high or too low, i.e. we are
interested in both sides of the distribution.
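• Choose α = 0.05. Assume the population standard deviation is known (σ = 0.8) and hence use the
Z test.
• Suppose the sample results are n = 100, $\bar{x}$ = 2.84.
$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{2.84 - 3}{0.8/\sqrt{100}} = \frac{-0.16}{0.08} = -2.0$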
A Two Tailed Test – Example (2-2)
• Here, z = -2.0 < -1.96, so the test statistic is in the rejection region.
• We reject the null hypothesis and conclude that there is sufficient evidence that
the population mean is not equal to 3.
• Visually -2.0 lies in this part of the rejection region:
• Since z = -2.0 < -1.96, we reject the null hypothesis and conclude that there is sufficient evidence that the mean
number of TVs in US homes is not equal to 3.
A Two Tailed Test - Example
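The p-Value approach for $z = \frac{2.84 - 3}{0.8/\sqrt{100}} = \frac{-0.16}{0.08} = -2.0$
• How likely is it to see a sample mean of 2.84 (or something further from the mean,
in either direction) if the true mean is μ = 3.0?
• P-value = P(Z < -2.0) + P(Z > 2.0) = 2(0.0228) = 0.0456, which is below α = 0.05, so H0 is rejected.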
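The two-tailed calculation can be checked the same way. This is a minimal sketch using the example's figures (sample mean 2.84, μ0 = 3, σ = 0.8, n = 100); the function name `two_tail_z_test` is a hypothetical helper. The exact normal CDF gives a p-value of about 0.0455, so values read from four-decimal tables can differ slightly in the last digit.

```python
from math import sqrt
from statistics import NormalDist


def two_tail_z_test(x_bar, mu0, sigma, n):
    """Return (z, p-value) for H0: mu = mu0 against H1: mu != mu0, sigma known."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    p_value = 2 * NormalDist().cdf(-abs(z))   # area in both tails beyond |z|
    return z, p_value


z, p_value = two_tail_z_test(x_bar=2.84, mu0=3.0, sigma=0.8, n=100)
alpha = 0.05
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```

Both routes agree: z lies beyond -1.96 and the p-value is below 0.05, so H0 is rejected.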
Compare this to (1). As always, it is very similar: just switch s for σ and t for z.
• The average cost of a hotel room in Paris is said to be Euros 168 per night. A
• Because we do not know the population standard deviation we use s and the t
statistic.
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = 1.46$
where n-1=24;
• Hence do not reject H0: there is not sufficient evidence that the true mean cost differs from
€168.
• Note that a sample mean that differs from the proposed population mean
is not in itself proof that the proposed population mean is wrong.
• The sample is the sample and we would not expect the sample mean to be the same
as the population mean.
Tests of the Population Proportion
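• Firstly, remember that in lecture 4 we had the following: the number of successes X has a binomial distribution, but
it can be approximated by a normal distribution when the sample is large enough; n should be at least n = 5/(P(1-P)).
• OK, we start by assuming that nP(1 – P) > 5, and hence the sampling distribution of $\hat{p}$ is approximately normal:
$z = \frac{\hat{p} - P_0}{\sqrt{P_0(1 - P_0)/n}}$ (4)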
Tests of the Population Proportion – Example (1-2)
• A marketing company claims that it receives 8% responses from its mailing. To test
this claim, a random sample of 500 were surveyed with 25 responses. Test at the
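α = .05 significance level.
• Check: nP(1 – P) = 500(.08)(.92) = 36.8 > 5. Our estimate of P is
$\hat{p}$ = 25/500 = .05
$z = \frac{\hat{p} - P_0}{\sqrt{P_0(1 - P_0)/n}} = \frac{.05 - .08}{\sqrt{.08(1 - .08)/500}} = -2.47$
• It’s a two-tailed test, so we need to compare with ±$z_{\alpha/2}$ = ±$z_{0.025}$ = ±1.96: (-2.47 < -1.96).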
Tests of the Population Proportion – Example (2-2)
Decision: The calculated value of Z falls in the rejection region. Hence, we reject the null
hypothesis and conclude that there is sufficient evidence to reject the company’s claim
of an 8% response rate.
• Calculate the p-value and compare to α. (For a two sided test the p-value is always
two sided).
p-value = .0136:
$P(Z < -2.47) + P(Z > 2.47) = 2(.0068) = 0.0136$
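The mailing example can be worked through in code as a check. This is a minimal sketch using the example's figures (25 responses out of 500, claimed rate 8%); the function name `proportion_z_test` is a hypothetical helper. Because the code keeps z unrounded, its p-value comes out slightly below the table-based .0136.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, p0):
    """Two-tailed z test of H0: P = p0 using the normal approximation."""
    if n * p0 * (1 - p0) <= 5:
        raise ValueError("normal approximation needs nP(1 - P) > 5")
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # standard error uses the hypothesised p0
    p_value = 2 * NormalDist().cdf(-abs(z))
    return p_hat, z, p_value


p_hat, z, p_value = proportion_z_test(successes=25, n=500, p0=0.08)
print(f"p_hat = {p_hat:.2f}, z = {z:.2f}, p-value = {p_value:.4f}")
```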
Week 29
In general, dimensions for parts produced on the same machine are typically closer
Thus, matched pairs of observations are preferred when comparing measurements from
two populations, as they result in smaller variance. This smaller variance increases the
• Let $\bar{d}$ and $s_d$ represent the observed sample mean and standard deviation for the
differences $d_i = x_i - y_i$.
• If the population distribution of the differences is normal, the following tests are
$t = \frac{\bar{d}}{s_d/\sqrt{n}}$, where t has n - 1 d.f.
Dependent Samples Example (1-4)
• Assume you send your salespeople to a “customer service” training workshop.
Has the training made a difference in the number of complaints? (α = 0.05). You
• Has the training made a difference in the number of complaints (at the α = 0.05
level)? $\bar{d}$ = -4.2; $s_d$ = 5.67; n=5.
H0: μx – μy = 0
H1: μx – μy ≠ 0
$t = \frac{\bar{d}}{s_d/\sqrt{n}} = \frac{-4.2}{5.67/\sqrt{5}} = -1.66$
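The matched-pairs statistic can be recomputed from the summary figures above (mean difference -4.2, standard deviation of differences 5.67, n = 5). This is a minimal sketch: the function name `paired_t_statistic` is a hypothetical helper, and the critical value 2.776 is an assumption taken from a standard t table for 4 degrees of freedom with 2.5% in each tail; it is not stated in the slides.

```python
from math import sqrt


def paired_t_statistic(d_bar, s_d, n):
    """t statistic for H0: mu_x - mu_y = 0 from matched-pairs summary statistics."""
    return d_bar / (s_d / sqrt(n))


t = paired_t_statistic(d_bar=-4.2, s_d=5.67, n=5)
df = 5 - 1

# Assumption: two-tailed critical value for df = 4 at alpha = 0.05, from a standard t table.
t_critical = 2.776
reject = abs(t) > t_critical
print(f"t = {t:.2f}, df = {df}, reject H0: {reject}")
```

Under that assumed critical value, |t| is inside the acceptance region, so this sample would not let us reject H0 at the 5% level.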
• Assumptions:
– Samples are randomly and independently drawn.
– Both population distributions are normal.
– Population variances are known.
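• When $\sigma_x^2$ and $\sigma_y^2$ [the population variances] are known, the variance of $\bar{X} - \bar{Y}$ is $\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}$ and the
corresponding Z variable is defined as:
$Z = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}}$ (2)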
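Independent Samples: If $\sigma_x^2$ and $\sigma_y^2$ are Known (2-2)
$Z = \frac{\bar{x} - \bar{y}}{\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}}$ (when H0 sets μx – μy = 0)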
Decision Rule
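Lower-tail test: H0: μx – μy ≥ 0; H1: μx – μy < 0, reject H0 if Z < –Zα
Upper-tail test: H0: μx – μy ≤ 0; H1: μx – μy > 0, reject H0 if Z > Zα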
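Independent Samples: If $\sigma_x^2$ and $\sigma_y^2$ are Known (3-3)
Decision Rule
Two-tail test:
$H_0: \mu_x - \mu_y = 0$; $H_1: \mu_x - \mu_y \ne 0$, reject $H_0$ if $\frac{|\bar{x} - \bar{y}|}{\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}} > z_{\alpha/2}$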
• For large sample sizes, replacing population variances with sample variances
provides a good approximation at significance level α.
• Additionally, the CLT ensures accurate approximations even if populations are not
normally distributed.
Independent Samples: Population Variance is known – Example (1-2)
Shirley Brown, an agricultural economist, is comparing cow manure and turkey dung
as fertilizers for farmers.
She'll determine whether turkey dung, offered at a favourable price by a major turkey
farmer, increases productivity compared to cow manure.
• To begin the study, Shirley specified a hypothesis test with H0: μx – μy ≤ 0 and H1: μx – μy > 0,
• where μx is the population mean productivity using turkey dung and μy is the population
mean productivity using cow manure.
• H1 indicates that turkey dung results in higher productivity.
Independent Samples: Population Variance is known – Example (2-2)
• This is greater than Z0.05 = 1.645; hence reject H0 and accept H1. Confirm that the p-value for
this is 0.0094.
Tests of the Difference Between Two Means: $\sigma_x^2$ and $\sigma_y^2$ are Unknown But
Assumed Equal (1-3)
• Assumptions:
– Samples are randomly and independently drawn.
– Populations are normally distributed.
– Population variances are unknown but assumed equal.
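• The population variances are assumed equal, so use the two sample standard
deviations and pool them to estimate σ.
• We use a t-value with $(n_x + n_y - 2)$ degrees of freedom.
• The t statistic for this is:
$t = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}}$, where $s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}$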
Tests of the Difference Between Two Means: $\sigma_x^2$ and $\sigma_y^2$ are Unknown But
Assumed Equal (2-3)
In these tests, we assume independent random samples of $n_x$ and $n_y$ observations
drawn from normally distributed populations with means $\mu_x$ and $\mu_y$ and a common variance.
The sample variances $s_x^2$ and $s_y^2$ are used to compute a pooled variance estimator $s_p^2$.
Then, using the observed sample means $\bar{x}$ and $\bar{y}$, the following tests have significance
level α:
$t = \frac{\bar{x} - \bar{y}}{\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}}$
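Tests of the Difference Between Two Means: $\sigma_x^2$ and $\sigma_y^2$ are Unknown But
Assumed Equal (3-3)
• Two Population Means, Independent Samples, Variances Unknown.
Lower-tail test: $H_0: \mu_x - \mu_y \ge 0$; $H_1: \mu_x - \mu_y < 0$
Upper-tail test: $H_0: \mu_x - \mu_y \le 0$; $H_1: \mu_x - \mu_y > 0$
Two-tail test: $H_0: \mu_x - \mu_y = 0$; $H_1: \mu_x - \mu_y \ne 0$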
Two Means: Unknown but Equal Variances Example (1-3)
• You are a financial analyst for a brokerage firm. Is there a difference in dividend
yield between stocks listed on the S&P500 & NASDAQ? You collect the following
data on yields:
• Assuming both populations are approximately normal with equal variances, is there
a difference in average yield (α = 0.05)?
Two Means: Unknown but Equal Variances Example (2-3)
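$H_0: \mu_1 - \mu_2 = 0$, i.e. ($\mu_1 = \mu_2$)
$H_1: \mu_1 - \mu_2 \ne 0$, i.e. ($\mu_1 \ne \mu_2$)
$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{3.27 - 2.53}{\sqrt{1.5021\left(\frac{1}{21} + \frac{1}{25}\right)}} = 2.040$
$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1)1.30^2 + (25 - 1)1.16^2}{(21 - 1) + (25 - 1)} = 1.5021$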
Two Means: Unknown but Equal Variances Example (3-3)
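df = 21 + 25 - 2 = 44
Critical Values: t = ± 2.0154
Test Statistic:
$t = \frac{3.27 - 2.53}{\sqrt{1.5021\left(\frac{1}{21} + \frac{1}{25}\right)}} = 2.040$
• Decision: The computed t = 2.040 exceeds 2.0154, so it falls in the rejection region and we reject H0 at the 5% level.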
• Conclusion: There is evidence of a difference in means.
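The pooled-variance calculation can be reproduced from the summary statistics used in the worked solution (means 3.27 and 2.53, standard deviations 1.30 and 1.16, sample sizes 21 and 25). This is a minimal sketch; the function name `pooled_t_test` is a hypothetical helper, and the critical value 2.0154 is the one quoted in the example rather than computed.

```python
from math import sqrt


def pooled_t_test(mean1, s1, n1, mean2, s2, n2):
    """Return (pooled variance, t, df) for H0: mu1 - mu2 = 0 with equal variances assumed."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df   # pooled variance estimator
    t = (mean1 - mean2) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return sp2, t, df


sp2, t, df = pooled_t_test(mean1=3.27, s1=1.30, n1=21, mean2=2.53, s2=1.16, n2=25)
t_critical = 2.0154   # two-tailed critical value for df = 44 at alpha = 0.05, as quoted in the example
print(f"sp2 = {sp2:.4f}, t = {t:.3f}, df = {df}, reject H0: {abs(t) > t_critical}")
```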
Test Hypotheses for two Population Proportions (1-3)
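$Z = \frac{(\hat{p}_x - \hat{p}_y) - (P_x - P_y)}{\sqrt{\frac{P_x(1 - P_x)}{n_x} + \frac{P_y(1 - P_y)}{n_y}}}$
has a standard normal distribution.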
Test Hypotheses for two Population Proportions (2-3)
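• We want to test the hypothesis that the population proportions Px and Py are equal.
Under H0 the common proportion is estimated by the pooled estimate $\hat{p}_0$, and the test statistic is
as follows:
$z = \frac{\hat{p}_x - \hat{p}_y}{\sqrt{\frac{\hat{p}_0(1 - \hat{p}_0)}{n_x} + \frac{\hat{p}_0(1 - \hat{p}_0)}{n_y}}}$ (8)
• Where $\hat{p}_0 = \frac{n_x \hat{p}_x + n_y \hat{p}_y}{n_x + n_y}$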
Test Hypotheses for two Population Proportions (3-3)
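Lower-tail test: $H_0: P_x - P_y \ge 0$; $H_1: P_x - P_y < 0$
Upper-tail test: $H_0: P_x - P_y \le 0$; $H_1: P_x - P_y > 0$
Two-tail test: $H_0: P_x - P_y = 0$; $H_1: P_x - P_y \ne 0$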
• Is there a significant difference between the proportion of men and the proportion of
women who will vote Yes on Proposition A?
Men: ; Women:
Two Population Proportions Example (2-2)
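$\hat{p}_0 = \frac{n_x \hat{p}_x + n_y \hat{p}_y}{n_x + n_y} = 0.549$
• The test statistic is (from (8)):
$z = \frac{\hat{p}_x - \hat{p}_y}{\sqrt{\frac{\hat{p}_0(1 - \hat{p}_0)}{n_x} + \frac{\hat{p}_0(1 - \hat{p}_0)}{n_y}}} = -1.31$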
• The critical Z value corresponding to a 5% rejection region, with 2.5% in each tail
is 1.96. The Z value above is less than that in absolute terms hence can’t reject the
null.
• Conclusion: There is not significant evidence of a difference between men and
women in proportions who will vote yes.
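The two-proportion test can be sketched in code as follows. The sample counts for men and women are not shown in the text above, so the counts used here (36 Yes out of 72 men, 31 Yes out of 50 women) are an assumption: they are illustrative values chosen because they reproduce the pooled estimate 0.549 and the statistic -1.31 quoted in the example. The function name `two_proportion_z_test` is likewise hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x_yes, n_x, y_yes, n_y):
    """Two-tailed z test of H0: Px - Py = 0 using the pooled proportion estimate."""
    p_x, p_y = x_yes / n_x, y_yes / n_y
    p0 = (x_yes + y_yes) / (n_x + n_y)                 # pooled estimate under H0
    se = sqrt(p0 * (1 - p0) / n_x + p0 * (1 - p0) / n_y)
    z = (p_x - p_y) / se
    p_value = 2 * NormalDist().cdf(-abs(z))
    return p0, z, p_value


# Assumed illustrative counts (not given in the text above).
p0, z, p_value = two_proportion_z_test(x_yes=36, n_x=72, y_yes=31, n_y=50)
print(f"pooled p0 = {p0:.3f}, z = {z:.2f}, reject H0 at 5%: {abs(z) > 1.96}")
```

With these assumed counts |z| is below 1.96, matching the conclusion that the null of equal proportions cannot be rejected.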
Summary (1-2)