23 +Lecture23+MAT361+ (08+APRIL+2025)

The document outlines the process of hypothesis testing, including the formulation of null and alternative hypotheses, the calculation of test statistics, and the interpretation of p-values. It discusses the significance of Type I and Type II errors, along with their probabilities, and emphasizes the importance of sample size in controlling these errors. Additionally, it provides illustrative examples, including drug efficacy tests, and concludes with methods for calculating confidence intervals and power of tests.

Probability and Statistics

MAT 361
Lecture 23: Test of hypothesis

Dr. Md. Alamin


Assistant Professor, DMP
North South University
Email: md.alamin06@northsouth.edu
06 April 2025


Hypothesis Testing

• Is also called significance testing
• Tests a claim about a parameter using evidence (data in a sample)
• The technique is introduced by considering a one-sample z test
• The procedure is broken into four steps
• Each element of the procedure must be understood
Hypothesis Testing Steps
A. Null and alternative hypotheses
B. Test statistic
C. P-value and interpretation
D. Significance level (optional)
Null and Alternative Hypotheses
• Convert the research question to null and alternative
hypotheses
• The null hypothesis (H0) is a claim of “no difference in
the population”
• The alternative hypothesis (Ha) claims “H0 is false”
• Collect data and seek evidence against H0 as a way of
bolstering Ha (deduction)
Illustrative Example: “Body Weight”
• The problem: In the 1970s, 20–29 year old men in the
U.S. had a mean μ body weight of 170 pounds.
Standard deviation σ was 40 pounds. We test whether
mean body weight in the population now differs.
• Null hypothesis H0: μ = 170 (“no difference”)
• The alternative hypothesis can be either Ha: μ > 170
(one-sided test) or
Ha: μ ≠ 170 (two-sided test)
Test Statistic
This is an example of a one-sample test of a mean when σ is known. Use this statistic to test the problem:

$$z_{\text{stat}} = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}}$$

where μ0 ≡ population mean assuming H0 is true

and $SE_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
Illustrative Example: z statistic
• For the illustrative example, μ0 = 170
• We know σ = 40
• Take an SRS of n = 64. Therefore

$$SE_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{40}{\sqrt{64}} = 5$$

• If we found a sample mean of 173, then

$$z_{\text{stat}} = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}} = \frac{173 - 170}{5} = 0.60$$
Illustrative Example: z statistic
If we found a sample mean of 185, then

$$z_{\text{stat}} = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}} = \frac{185 - 170}{5} = 3.00$$
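The two z statistics for the body-weight example can be reproduced in a few lines. This is a minimal sketch; the function name `z_statistic` and its parameter names are illustrative choices, not part of the lecture.

```python
import math

def z_statistic(sample_mean, mu0, sigma, n):
    """One-sample z statistic when the population SD (sigma) is known."""
    standard_error = sigma / math.sqrt(n)  # SE of the sample mean
    return (sample_mean - mu0) / standard_error

# Body-weight example: mu0 = 170, sigma = 40, n = 64
z_173 = z_statistic(173, mu0=170, sigma=40, n=64)
z_185 = z_statistic(185, mu0=170, sigma=40, n=64)
print(z_173, z_185)
```

The same function works for any one-sample problem where σ is treated as known.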
Example - Efficacy Test for New drug
• Drug company has new drug, wishes to compare it
with current standard treatment
• Federal regulators tell company that they must
demonstrate that new drug is better than current
treatment to receive approval
• Firm runs clinical trial where some patients receive
new drug, and others receive standard treatment
• Numeric response of therapeutic effect is obtained
(higher scores are better).
• Parameter of interest: μNew − μStd
Example - Efficacy Test for New drug
• Null hypothesis - New drug is no better than standard treatment

H0: μNew − μStd ≤ 0 (μNew − μStd = 0)

• Alternative hypothesis - New drug is better than standard treatment

HA: μNew − μStd > 0
• Experimental (Sample) data:

ȳNew, ȳStd (sample means)
sNew, sStd (sample standard deviations)
nNew, nStd (sample sizes)
Sampling Distribution of Difference in Means

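• In large samples, the difference in two sample means is approximately normally distributed (the second argument is the standard error):

$$\bar{Y}_1 - \bar{Y}_2 \sim N\left(\mu_1 - \mu_2,\ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right)$$

• Under the null hypothesis, μ1−μ2 = 0 and:

$$Z = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \sim N(0, 1)$$

• σ1² and σ2² are unknown and estimated by s1² and s2²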
Example - Efficacy Test for New drug

• Type I error - Concluding that the new drug is better than the standard (HA) when in fact it is no better (H0). Ineffective drug is deemed better.
• Traditionally α = P(Type I error) = 0.05

• Type II error - Failing to conclude that the new drug is better (HA) when in fact it is. Effective drug is deemed to be no better.
• Traditionally a clinically important difference (Δ) is assigned and sample sizes chosen so that:
β = P(Type II error | μ1−μ2 = Δ) ≤ .20
Hypothesis Testing
                 Test Result (Decision)
True State       H0 True          H0 False
H0 True          Correct          Type I Error
H0 False         Type II Error    Correct

α = P(Type I Error)     β = P(Type II Error)

• Goal: Keep α, β reasonably small


Four possible situations can arise in any test procedure. They are summarized below:

                 Actual Truth of H0
Decision         H0 is true          H0 is false
Accept H0        Correct Decision    Type II Error
Reject H0        Type I Error        Correct Decision


Type I Error

❖ A Type I error is the mistake of rejecting the null hypothesis when it is true.

❖ The symbol α (alpha) is used to represent the probability of a type I error.
Type II Error

❖ A Type II error is the mistake of failing to reject the null hypothesis when it is false.

❖ The symbol β (beta) is used to represent the probability of a type II error.
Controlling Type I &
Type II Errors

❖ For any fixed α, an increase in the sample size n will cause a decrease in β.

❖ For any fixed sample size n, a decrease in α will cause an increase in β. Conversely, an increase in α will cause a decrease in β.

❖ To decrease both α and β, increase the sample size.
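A small numerical illustration of these trade-offs, assuming a two-sided one-sample z test and the usual approximation β ≈ 1 − Φ(−z₁₋α/₂ + |μ0 − μa|√n/σ), which ignores the negligible far tail. The default values (μ0 = 170, μa = 190, σ = 40) and the function name are assumptions chosen only for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution

def type_ii_error(n, alpha, mu0=170.0, mu_a=190.0, sigma=40.0):
    """Approximate beta for a two-sided one-sample z test (illustrative inputs)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)          # critical value for alpha
    shift = abs(mu0 - mu_a) * math.sqrt(n) / sigma      # standardized true difference
    power = STD_NORMAL.cdf(-z_crit + shift)
    return 1 - power

# Fixed alpha, growing n: beta should shrink
beta_by_n = [type_ii_error(n, alpha=0.05) for n in (16, 32, 64)]
# Fixed n, shrinking alpha: beta should grow
beta_by_alpha = [type_ii_error(16, alpha=a) for a in (0.10, 0.05, 0.01)]
print(beta_by_n)
print(beta_by_alpha)
```

The first list decreases and the second increases, matching the three statements above.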
Hypothesis Testing Procedures
Interpreting a Decision
Example:
H0: (Claim) A cigarette manufacturer claims that less
than one-eighth of the US adult population smokes
cigarettes.

If H0 is rejected, you should conclude “there is sufficient evidence to indicate that the manufacturer’s claim is false.”

If you fail to reject H0, you should conclude “there is not sufficient evidence to indicate that the manufacturer’s claim is false.”
Elements of a Hypothesis Test
• Test Statistic - Difference between the Sample means, scaled to number of standard deviations (standard errors) from the null difference of 0 for the Population means:

$$T.S.:\ z_{obs} = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

• Rejection Region - Set of values of the test statistic that are consistent with HA, such that the probability it falls in this region when H0 is true is α (we will always set α = 0.05)

R.R.: zobs ≥ zα   (α = 0.05 ⇒ zα = 1.645)


P-value (aka Observed Significance Level)
• P-value - Measure of the strength of evidence the sample data provides against the null hypothesis:
P(Evidence this strong or stronger against H0 | H0 is true)

P-val: p = P(Z ≥ zobs)
Large-Sample Test H0: μ1−μ2 = 0 vs HA: μ1−μ2 > 0

• H0: μ1−μ2 = 0 (No difference in population means)
• HA: μ1−μ2 > 0 (Population Mean 1 > Pop Mean 2)
• T.S.:

$$z_{obs} = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

• R.R.: zobs ≥ zα
• P-value: P(Z ≥ zobs)
• Conclusion - Reject H0 if test statistic falls in rejection region, or equivalently the P-value is ≤ α
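The large-sample test can be sketched as a single function that returns the test statistic, the P-value, and the decision. The function name and the summary statistics in the usage line are hypothetical and do not come from the lecture.

```python
import math
from statistics import NormalDist

def large_sample_z_test(mean1, sd1, n1, mean2, sd2, n2, alpha=0.05):
    """Right-sided test of H0: mu1 - mu2 = 0 vs HA: mu1 - mu2 > 0."""
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    z_obs = (mean1 - mean2) / standard_error
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # rejection-region cutoff
    p_value = 1 - NormalDist().cdf(z_obs)       # P(Z >= z_obs)
    return z_obs, p_value, z_obs >= z_alpha

# Hypothetical summary statistics (not from the lecture)
z_obs, p_value, reject = large_sample_z_test(12.0, 5.0, 50, 10.0, 5.0, 50)
print(z_obs, p_value, reject)
```

Checking `z_obs >= z_alpha` and checking `p_value <= alpha` lead to the same decision, which is the equivalence stated in the conclusion.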
Example - Rosiglitazone for HIV-1 Lipoatrophy

• Trts - Rosiglitazone vs Placebo
• Response - Change in Limb fat mass
• Clinically Meaningful Difference - 0.5 (std dev’s)
• Desired Power - 1−β = 0.80
• Significance Level - α = 0.05

zα/2 = 1.96   zβ = z.20 = .84

$$n_1 = n_2 = \frac{2(1.96 + 0.84)^2}{(0.5)^2} \approx 63$$
Source: Carr, et al (2004)
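The sample-size arithmetic for this example can be checked directly. This is a sketch using the rounded table values 1.96 and 0.84 from the slide; the function name is an illustrative choice.

```python
import math

def per_group_sample_size(delta, z_alpha_half=1.96, z_beta=0.84):
    """Per-group n for a 2-sided test; delta is the difference in SD units."""
    raw = 2 * (z_alpha_half + z_beta) ** 2 / delta ** 2
    return raw, math.ceil(raw)  # round up so power is not below target

raw_n, n_per_group = per_group_sample_size(delta=0.5)
print(raw_n, n_per_group)
```

The raw value is about 62.72, which rounds up to the 63 subjects per group quoted above.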
Confidence Intervals
• Normally Distributed data - approximately 95% of individual measurements lie within 2 standard deviations of the mean
• Difference between 2 sample means is approximately normally distributed in large samples (regardless of shape of distribution of individual measurements):

$$\bar{Y}_1 - \bar{Y}_2 \sim N\left(\mu_1 - \mu_2,\ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right)$$

• Thus, we can expect (with 95% confidence) that our sample mean difference lies within 2 standard errors of the true difference
(1−α)100% Confidence Interval for μ1−μ2

• Large sample Confidence Interval for μ1−μ2:

$$(\bar{y}_1 - \bar{y}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

• Standard level of confidence is 95% (z.025 = 1.96 ≈ 2)
• (1−α)100% CI’s and 2-sided tests reach the same conclusions regarding whether μ1−μ2 = 0
Example - Viagra for ED
• Comparison of Viagra (Group 1) and Placebo (Group 2) for ED
• Data pooled from 6 double-blind trials
• Subjects - White males
• Response - Percent of successful intercourse attempts in past 4 weeks (Each subject reports his own percentage)

ȳ1 = 63.2   s1 = 41.3   n1 = 264
ȳ2 = 23.5   s2 = 42.3   n2 = 240

95% CI for μ1−μ2:

$$(63.2 - 23.5) \pm 1.96\sqrt{\frac{(41.3)^2}{264} + \frac{(42.3)^2}{240}} \equiv 39.7 \pm 7.3 \equiv (32.4,\ 47.0)$$
Source: Carson, et al (2002)
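The interval can be recomputed from the summary statistics. A minimal sketch, assuming the large-sample formula with z = 1.96; the function name is illustrative.

```python
import math

def two_sample_ci(mean1, sd1, n1, mean2, sd2, n2, z=1.96):
    """Large-sample confidence interval for mu1 - mu2."""
    diff = mean1 - mean2
    margin = z * math.sqrt(sd1**2 / n1 + sd2**2 / n2)  # z times the standard error
    return diff - margin, diff + margin

# Summary statistics from the example (Group 1 vs Group 2)
lower, upper = two_sample_ci(63.2, 41.3, 264, 23.5, 42.3, 240)
print(round(lower, 1), round(upper, 1))
```

Because the interval excludes 0, a 2-sided test at α = 0.05 would also reject μ1−μ2 = 0.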
P-value
• The P-value answers the question: What is the probability of the observed test statistic or one more extreme when H0 is true?
• This corresponds to the AUC (area under the curve) in the tail of the Standard Normal distribution beyond the zstat.
• Convert z statistics to P-value:
For Ha: μ > μ0 ⇒ P = Pr(Z > zstat) = right tail beyond zstat
For Ha: μ < μ0 ⇒ P = Pr(Z < zstat) = left tail beyond zstat
For Ha: μ ≠ μ0 ⇒ P = 2 × one-tailed P-value
• Use Table B or software to find these probabilities (next two slides).
Two-Sided P-Value
• One-sided Ha ⇒ AUC in tail beyond zstat
• Two-sided Ha ⇒ consider potential deviations in both directions ⇒ double the one-sided P-value

Examples: If one-sided P = 0.0010, then two-sided P = 2 × 0.0010 = 0.0020. If one-sided P = 0.2743, then two-sided P = 2 × 0.2743 = 0.5486.
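Software can stand in for the table lookup. The sketch below converts a z statistic to a P-value with Python's standard library; the function name and the labels "greater", "less", and "two-sided" are naming choices made here, not notation from the lecture.

```python
from statistics import NormalDist

def p_value(z_stat, alternative="two-sided"):
    """Convert a z statistic to a P-value for the chosen alternative."""
    cdf = NormalDist().cdf
    if alternative == "greater":      # Ha: mu > mu0, right tail
        return 1 - cdf(z_stat)
    if alternative == "less":         # Ha: mu < mu0, left tail
        return cdf(z_stat)
    if alternative == "two-sided":    # Ha: mu != mu0, double the tail beyond |z|
        return 2 * (1 - cdf(abs(z_stat)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

# z = 0.60 from the body-weight example with a sample mean of 173
p_right = p_value(0.60, "greater")
p_two = p_value(0.60)
print(round(p_right, 4), round(p_two, 4))
```

For z = 0.60 the right-tail value agrees with the 0.2743 used in the example. Doubling the unrounded tail area can differ from doubling the rounded table value in the last digit.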
Interpretation
• The P-value answers the question: What is the probability of the observed test statistic … when H0 is true?
• Thus, smaller and smaller P-values provide stronger and stronger evidence against H0
• Small P-value ⇒ strong evidence
Interpretation
Conventions*
P > 0.10 ⇒ non-significant evidence against H0
0.05 < P ≤ 0.10 ⇒ marginally significant evidence
0.01 < P ≤ 0.05 ⇒ significant evidence against H0
P ≤ 0.01 ⇒ highly significant evidence against H0

Examples
P = .27 ⇒ non-significant evidence against H0
P = .01 ⇒ highly significant evidence against H0

* It is unwise to draw firm borders for “significance”


α-Level (Used in some situations)

• Let α ≡ probability of erroneously rejecting H0
• Set α threshold (e.g., let α = .10, .05, or whatever)
• Reject H0 when P ≤ α
• Retain H0 when P > α
• Example: Set α = .10. Find P = 0.27 ⇒ retain H0
• Example: Set α = .01. Find P = .001 ⇒ reject H0
(Summary) One-Sample z Test
A. Hypothesis statements
H0: µ = µ0 vs.
Ha: µ ≠ µ0 (two-sided) or
Ha: µ < µ0 (left-sided) or
Ha: µ > µ0 (right-sided)
B. Test statistic

$$z_{\text{stat}} = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}} \quad \text{where} \quad SE_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

C. P-value: convert zstat to P value
D. Significance statement (usually not necessary)
Conditions for z test
• σ known (not from data)
• Population approximately Normal or large
sample (central limit theorem)
• SRS (or facsimile)
• Data valid
Power of a Test
• Power - Probability a test rejects H0 (depends on μ1−μ2)
• H0 True: Power = P(Type I error) = α
• H0 False: Power = 1−P(Type II error) = 1−β

• Example:
• H0: μ1−μ2 = 0   HA: μ1−μ2 > 0
• σ1² = σ2² = 25   n1 = n2 = 25
• Decision Rule: Reject H0 (at α = 0.05 significance level) if:

$$z_{obs} = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{2}} \ge 1.645 \iff \bar{y}_1 - \bar{y}_2 \ge 2.326$$
Power of a Test
• Now suppose in reality that μ1−μ2 = 3.0 (HA is true)
• Power now refers to the probability we (correctly) reject the null hypothesis. Note that the sampling distribution of the difference in sample means is approximately normal, with mean 3.0 and standard deviation (standard error) 1.414.
• Decision Rule (from last slide): Conclude population means differ if the sample mean for group 1 is at least 2.326 higher than the sample mean for group 2
• Power for this case can be computed as:

$$P(\bar{Y}_1 - \bar{Y}_2 \ge 2.326), \qquad \bar{Y}_1 - \bar{Y}_2 \sim N\left(3,\ \sqrt{2.0} = 1.414\right)$$



Power of a Test

$$\text{Power} = P(\bar{Y}_1 - \bar{Y}_2 \ge 2.326) = P\left(Z \ge \frac{2.326 - 3}{1.41} = -0.48\right) = .6844$$

• All else being equal:
• As sample sizes increase, power increases
• As population variances decrease, power increases
• As the true mean difference increases, power increases
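The power calculation can be retraced numerically. This sketch assumes the setup of the example (both population standard deviations equal to 5, n1 = n2 = 25, a right-sided test at α = 0.05); the function name is illustrative. Because it does not round z to −0.48, it returns roughly 0.683 rather than the table-based .6844.

```python
import math
from statistics import NormalDist

def two_sample_power(true_diff, sigma1, sigma2, n1, n2, alpha=0.05):
    """Power of the right-sided large-sample z test at a given true difference."""
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    cutoff = NormalDist().inv_cdf(1 - alpha) * se   # reject if ybar1 - ybar2 >= cutoff
    # Under HA the difference in sample means is N(true_diff, se)
    power = 1 - NormalDist(mu=true_diff, sigma=se).cdf(cutoff)
    return cutoff, power

cutoff, power = two_sample_power(3.0, 5.0, 5.0, 25, 25)
print(round(cutoff, 3), round(power, 3))
```

Calling the function with larger sample sizes, smaller standard deviations, or a larger `true_diff` shows the three monotonic effects listed above.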
Power of a Test
[Figure: sampling distributions of the difference in sample means under H0 and under HA]
Power of a Test

Power Curves for group sample sizes of 25, 50, 75, 100 and varying true values μ1−μ2 with σ1 = σ2 = 5.

• For given μ1−μ2, power increases with sample size
• For given sample size, power increases with μ1−μ2
Power and Sample Size
Two types of decision errors:
Type I error = erroneous rejection of true H0
Type II error = erroneous retention of false H0

                 Truth
Decision         H0 true              H0 false
Retain H0        Correct retention    Type II error
Reject H0        Type I error         Correct rejection

α ≡ probability of a Type I error
β ≡ probability of a Type II error
Power
• β ≡ probability of a Type II error
β = Pr(retain H0 | H0 false)
(the “|” is read as “given”)

• 1 – β = “Power” ≡ probability of avoiding a Type II error
1 – β = Pr(reject H0 | H0 false)
Power of a z test

$$1 - \beta = \Phi\left(-z_{1-\frac{\alpha}{2}} + \frac{|\mu_0 - \mu_a|\sqrt{n}}{\sigma}\right)$$

where
• Φ(z) represents the cumulative probability of Standard Normal Z
• μ0 represents the population mean under the null hypothesis
• μa represents the population mean under the alternative hypothesis
Calculating Power: Example
A study of n = 16 retains H0: μ = 170 at α = 0.05 (two-sided); σ is 40. What was the power of the test to detect a population mean of 190?

$$\begin{aligned}
1 - \beta &= \Phi\left(-z_{1-\frac{\alpha}{2}} + \frac{|\mu_0 - \mu_a|\sqrt{n}}{\sigma}\right) \\
&= \Phi\left(-1.96 + \frac{|170 - 190|\sqrt{16}}{40}\right) \\
&= \Phi(0.04) \\
&= 0.5160
\end{aligned}$$
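The worked example can be verified with the standard normal CDF from Python's standard library. A minimal sketch; the function name is illustrative and the critical value 1.96 is the rounded table value used in the example.

```python
import math
from statistics import NormalDist

def one_sample_power(mu0, mu_a, sigma, n, z_crit=1.96):
    """Power of a two-sided one-sample z test via the formula above."""
    shift = abs(mu0 - mu_a) * math.sqrt(n) / sigma
    return NormalDist().cdf(-z_crit + shift)   # Phi(-z + |mu0 - mu_a| sqrt(n) / sigma)

power = one_sample_power(mu0=170, mu_a=190, sigma=40, n=16)
print(round(power, 4))
```

With n = 16 the argument of Φ is 0.04, so the test has only about a 52% chance of detecting a true mean of 190.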
Reasoning Behind Power
• Competing sampling distributions
Top curve (next page) assumes H0 is true
Bottom curve assumes Ha is true
α is set to 0.05 (two-sided)
• We will reject H0 when a sample mean exceeds 189.6 (right tail, top
curve)
• The probability of getting a value greater than 189.6 on the bottom
curve is 0.5160, corresponding to the power of the test
Sample Size Requirements
Sample size for one-sample z test:

$$n = \frac{\sigma^2\left(z_{1-\beta} + z_{1-\frac{\alpha}{2}}\right)^2}{\Delta^2}$$

where
1 – β ≡ desired power
α ≡ desired significance level (two-sided)
σ ≡ population standard deviation
Δ = μ0 – μa ≡ the difference worth detecting
Example: Sample Size Requirement

How large a sample is needed for a one-sample z test with 90% power and α = 0.05 (two-tailed) when σ = 40? Let H0: μ = 170 and Ha: μ = 190 (thus, Δ = μ0 − μa = 170 – 190 = −20)

$$n = \frac{\sigma^2\left(z_{1-\beta} + z_{1-\frac{\alpha}{2}}\right)^2}{\Delta^2} = \frac{40^2 (1.28 + 1.96)^2}{(-20)^2} = 41.99$$

Round up to 42 to ensure adequate power.
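The same arithmetic in code, as a sketch that uses the rounded table values 1.28 and 1.96 from the example; the function name is an illustrative choice. With unrounded normal quantiles the raw value comes out marginally above 42, so the rounded-up answer can differ by one.

```python
import math

def one_sample_size(sigma, delta, z_power=1.28, z_alpha_half=1.96):
    """Required n for a one-sample z test, using table z values."""
    raw = sigma**2 * (z_power + z_alpha_half) ** 2 / delta**2
    return raw, math.ceil(raw)  # round up to guarantee the target power

raw_n, n_required = one_sample_size(sigma=40, delta=-20)
print(round(raw_n, 2), n_required)
```

The sign of Δ does not matter because it enters the formula squared.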


Illustration: conditions
for 90% power.
2-Sided Tests
• Many studies don’t assume a direction with respect to the difference μ1−μ2
• H0: μ1−μ2 = 0   HA: μ1−μ2 ≠ 0
• Test statistic is the same as before
• Decision Rule:
• Conclude μ1−μ2 > 0 if zobs ≥ zα/2 (α = 0.05 ⇒ zα/2 = 1.96)
• Conclude μ1−μ2 < 0 if zobs ≤ −zα/2 (α = 0.05 ⇒ −zα/2 = −1.96)
• Do not reject μ1−μ2 = 0 if −zα/2 ≤ zobs ≤ zα/2
• P-value: 2P(Z ≥ |zobs|)
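The three-way decision rule translates directly into code. A minimal sketch; the function name and the wording of the returned strings are illustrative.

```python
from statistics import NormalDist

def two_sided_decision(z_obs, alpha=0.05):
    """Apply the 2-sided decision rule and return (conclusion, P-value)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)           # z_{alpha/2}
    p_value = 2 * (1 - NormalDist().cdf(abs(z_obs)))       # 2 P(Z >= |z_obs|)
    if z_obs >= z_crit:
        conclusion = "mu1 - mu2 > 0"
    elif z_obs <= -z_crit:
        conclusion = "mu1 - mu2 < 0"
    else:
        conclusion = "do not reject mu1 - mu2 = 0"
    return conclusion, p_value

print(two_sided_decision(2.5))
print(two_sided_decision(-2.5))
print(two_sided_decision(1.0))
```

The test statistic is computed exactly as in the one-sided case; only the cutoff and the doubling of the tail area change.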
Sample Size Calculations for Fixed Power
• Goal - Choose sample sizes to have a favorable chance of detecting a clinically meaningful difference
• Step 1 - Define an important difference in means:
• Case 1: σ approximated from prior experience or pilot study - difference can be stated in units of the data
• Case 2: σ unknown - difference must be stated in units of standard deviations of the data

$$\delta = \frac{\mu_1 - \mu_2}{\sigma}$$

• Step 2 - Choose the desired power to detect the clinically meaningful difference (1−β, typically at least .80). For 2-sided test:

$$n_1 = n_2 = \frac{2\left(z_{\alpha/2} + z_{\beta}\right)^2}{\delta^2}$$
Hypothesis Testing for Differences
Hypothesis Tests
• Parametric Tests (Metric)
  • One Sample: t test, Z test
  • Two or More Samples
    • Independent Samples: Two-Group t test, Z test
    • Paired t test
• Non-parametric Tests (Nonmetric)
END
