Module3 Part5 Two Sample T Procedure
Two-sample Problems
q Compare the means of some quantitative variable for two
populations, Population 1 and Population 2
q Compare the means of some response or outcome variable
under two treatments, Treatment 1 and Treatment 2
q Parameters: µ1 and µ2
§ Population mean for Population 1 and Population 2
§ Mean response for Treatment 1 and Treatment 2
Two-sample Data
q Draw separate SRSs of sizes n1 and n2 from Normal
populations N(µ1, σ1) and N(µ2, σ2), respectively.
q An SRS of n1 + n2 experimental units is randomly assigned to two
treatment groups. The response or outcome variable measured on all
possible experimental units in the two treatment groups
follows N(µ1, σ1) and N(µ2, σ2), respectively.
q Random; Normal; Independence
Two-sample Data
Population or Treatment   Parameter    Sample Size   Sample Mean   Sample SD
1                         N(µ1, σ1)    n1            X̄1            S1
2                         N(µ2, σ2)    n2            X̄2            S2
Two-Sample vs Matched Pairs
Example: Does the job satisfaction of assembly-line workers differ when their work is
machine-paced rather than self-paced?
Study 1: Choose an SRS of 18 workers and assess each worker's satisfaction
after working in each setting.
This is a matched pairs problem.
Study 2: Choose an SRS of 36 workers. Half of these workers (18 workers) are allowed
to pace themselves while the other half (18 workers) use an assembly line that moves
at a fixed pace.
This is a two-sample problem.
Matched Pairs t Procedures
q Study designs that involve making two observations on
the same individual, or one observation on each of two
similar individuals, result in paired data.
q For paired data, we can make comparisons by analyzing
the differences in each pair.
q If the conditions for inference are met, we can use one-
sample t procedures to perform inference about the mean
difference µd.
Example
Insurance adjusters are concerned about the high estimates
they are receiving for auto repairs from garage I compared to
garage II. To verify their suspicion, each of 15 cars recently
involved in an accident was taken to both garages for
separate estimates of repair costs.
Example
[Figure: repair cost estimates for Garage 1 and Garage 2; vertical axis from 12 to 22]
Hypotheses
q Insurance adjusters are concerned about the high
estimates they are receiving for auto repairs from garage I
compared to garage II.
q Let µd denote the difference of the average estimate from
garage 1 minus the average estimate from garage 2
q We want to perform a test at the α = 0.05 significance
level of H0: µd = 0 versus Ha: µd > 0
q Take the difference: d = garage 1 – garage 2
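The paired analysis can be sketched in Python as a one-sample t statistic computed on the differences. This is a minimal sketch using only the standard library; the function name `paired_t_statistic` and the numbers below are hypothetical and are not the data from the garage example.

```python
import math
import statistics

def paired_t_statistic(first, second, mu0=0.0):
    """One-sample t statistic on the paired differences first - second.

    Returns (t, df), where df = n - 1 and n is the number of pairs.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)      # sample mean of the differences
    s_d = statistics.stdev(diffs)       # sample SD (divides by n - 1)
    t = (d_bar - mu0) / (s_d / math.sqrt(n))
    return t, n - 1

# Hypothetical estimates for illustration only (not the slide's data).
garage1 = [17.6, 20.2, 19.5, 11.3, 13.0]
garage2 = [17.3, 19.1, 18.4, 11.5, 12.7]
t, df = paired_t_statistic(garage1, garage2)
print(f"t = {t:.3f}, df = {df}")
```

For Ha: µd > 0, a large positive t is evidence against H0; the P-value would come from the upper tail of the t distribution with n − 1 degrees of freedom.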
One-sample t test on the difference
[Figure: plots of the differences d; axis labels include Density and Sample Quantiles]
q Standard deviation of X̄1 − X̄2 is
$$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
q Standard error of X̄1 − X̄2 is
$$\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
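The standard deviation of the difference follows from the independence of the two samples. A short derivation, assuming the Random, Normal, and Independence conditions stated earlier:

```latex
\begin{aligned}
\operatorname{Var}(\bar{X}_1 - \bar{X}_2)
  &= \operatorname{Var}(\bar{X}_1) + \operatorname{Var}(\bar{X}_2)
  && \text{(independent samples)} \\
  &= \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}, \\
\operatorname{SD}(\bar{X}_1 - \bar{X}_2)
  &= \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} .
\end{aligned}
```

Replacing the unknown σ1 and σ2 with the sample standard deviations s1 and s2 gives the standard error.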
Two-sample t statistic
q Parameter of interest: µ1 - µ2
q Two-sample t statistic
$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
q T has approximately a t distribution
q DF: use a conservative approach, taking the smaller of n1 − 1 and
n2 − 1 as the degrees of freedom.
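As a rough illustration, the unpooled two-sample t statistic and the conservative degrees of freedom could be computed as follows. The function name `two_sample_t`, its `null_diff` parameter, and the sample values are hypothetical, chosen only for this sketch.

```python
import math
import statistics

def two_sample_t(sample1, sample2, null_diff=0.0):
    """Unpooled two-sample t statistic with conservative degrees of freedom.

    Returns (t, df, se), where df = min(n1, n2) - 1.
    """
    n1, n2 = len(sample1), len(sample2)
    x1_bar = statistics.mean(sample1)
    x2_bar = statistics.mean(sample2)
    s1_sq = statistics.variance(sample1)   # sample variance, divides by n1 - 1
    s2_sq = statistics.variance(sample2)
    se = math.sqrt(s1_sq / n1 + s2_sq / n2)
    t = ((x1_bar - x2_bar) - null_diff) / se
    df = min(n1, n2) - 1                   # smaller of n1 - 1 and n2 - 1
    return t, df, se

# Hypothetical data for illustration only.
group1 = [5, 7, 9]
group2 = [1, 2, 3, 6]
t, df, se = two_sample_t(group1, group2)
print(f"t = {t:.3f}, df = {df}, SE = {se:.3f}")
```

The conservative df is smaller than the value statistical software typically reports, so the resulting P-values are somewhat larger and the confidence intervals somewhat wider.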
Confidence Interval for µ1 - µ2
When the Random, Normal, and Independence conditions are
met, a level C confidence interval for µ1 - µ2 is
$$(\bar{X}_1 - \bar{X}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
q t* is obtained from the t distribution with df = 30 − 1 = 29 (the smaller of n1 − 1 and n2 − 1)
Example
q A 90% confidence interval for µ1 - µ2 is
$$(\bar{X}_1 - \bar{X}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
q t* = 1.699
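A minimal sketch of the interval calculation from summary statistics, using the critical value t* = 1.699 quoted above. The function name `two_sample_ci` and the summary statistics passed to it are hypothetical, since the example's sample means and standard deviations are not given in the text.

```python
import math

def two_sample_ci(x1_bar, s1, n1, x2_bar, s2, n2, t_star):
    """Confidence interval for mu1 - mu2 from summary statistics.

    t_star is the critical value supplied by the caller (from a t table).
    """
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)   # standard error of the difference
    diff = x1_bar - x2_bar
    margin = t_star * se
    return diff - margin, diff + margin

# Hypothetical summary statistics; t_star = 1.699 is the 90% value for df = 29.
low, high = two_sample_ci(x1_bar=10.0, s1=3.0, n1=30,
                          x2_bar=8.0, s2=4.0, n2=30,
                          t_star=1.699)
print(f"90% CI for mu1 - mu2: ({low:.3f}, {high:.3f})")
```

Passing t* in as an argument keeps the sketch free of third-party libraries; with a statistics package available, the critical value could be looked up from the t distribution instead.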
Example
q A 90% confidence interval for µ1 - µ2
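Two-sample t statistic for testing H0: µ1 - µ2 = 0:
$$T = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
More generally, for a hypothesized difference d0:
$$T = \frac{(\bar{X}_1 - \bar{X}_2) - d_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$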
[Figure: response by group (calcium, placebo), with two quantile plots (axis label: Sample Quantiles; horizontal axis from −1.5 to 1.5)]
q Test statistic
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
Pooled two-sample procedure
q A level C confidence interval for µ1 − µ2 is
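$$(\bar{X}_1 - \bar{X}_2) \pm t^* s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
$$T = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
§ The P-value is calculated under a $t_{n_1 + n_2 - 2}$ curve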
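The pooled procedure can be sketched the same way, assuming the two populations share a common standard deviation. The function name `pooled_two_sample_t` and the sample values are hypothetical.

```python
import math
import statistics

def pooled_two_sample_t(sample1, sample2):
    """Pooled two-sample t statistic, assuming equal population variances.

    Returns (t, df, sp), where df = n1 + n2 - 2 and sp is the pooled SD.
    """
    n1, n2 = len(sample1), len(sample2)
    s1_sq = statistics.variance(sample1)
    s2_sq = statistics.variance(sample2)
    # Pooled variance: weighted average of the two sample variances.
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    sp = math.sqrt(sp_sq)
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / se
    return t, n1 + n2 - 2, sp

# Hypothetical data for illustration only.
group1 = [5, 7, 9]
group2 = [1, 2, 3, 6]
t, df, sp = pooled_two_sample_t(group1, group2)
print(f"t = {t:.3f}, df = {df}, pooled SD = {sp:.3f}")
```

Compared with the unpooled statistic, the pooled version gains degrees of freedom (n1 + n2 − 2 rather than the smaller of n1 − 1 and n2 − 1), but it relies on the equal-variance assumption.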
Test H0: σ1=σ2
q Suppose the two populations are N(µ1, σ1) and N(µ2, σ2). Draw two
independent SRSs of sizes n1 and n2.
q F statistic:
$$F = \frac{s_1^2}{s_2^2}$$
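A short sketch of the F statistic as a ratio of sample variances. The function name `variance_ratio_f` and the data are hypothetical. The degrees of freedom returned (n1 − 1 and n2 − 1) are the standard ones for this statistic under H0 and are an addition here, since the text above does not state them.

```python
import statistics

def variance_ratio_f(sample1, sample2):
    """F statistic for H0: sigma1 = sigma2, as the ratio s1^2 / s2^2.

    Returns (f, df1, df2) with df1 = n1 - 1 and df2 = n2 - 1.
    """
    s1_sq = statistics.variance(sample1)   # sample variance of group 1
    s2_sq = statistics.variance(sample2)   # sample variance of group 2
    f = s1_sq / s2_sq
    return f, len(sample1) - 1, len(sample2) - 1

# Hypothetical data for illustration only.
group1 = [5, 7, 9]
group2 = [1, 2, 3, 6]
f, df1, df2 = variance_ratio_f(group1, group2)
print(f"F = {f:.3f} with ({df1}, {df2}) degrees of freedom")
```

Values of F far from 1, in either direction, suggest the two population standard deviations differ.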