Testing of Hypothesis-I
(Session 7: 3 / 4 August 2024)
Study the following examples

Suppose some data is given below. How will you decide which brand to purchase?

Brands   Sample size   Mean (kms)   SD (kms)
A        30            38600        5000
B        40            35450        6100

Example 2:
Two judges have to judge independently whether the defendant is innocent or guilty on the basis of evidence. Lack of sufficient evidence may lead to erroneous decisions such as a false positive or a false negative. Suppose that, based on the evidence, we are interested in finding which judge has committed fewer or more false positives in judgement than the other.
How will you decide which judge?
Slide 7 of 137 BITS Pilani, Pilani Campus Slide 8 of 137 BITS Pilani, Pilani Campus
Study the following examples

Suppose some data is given below. How will you decide which formula may be better?

Formula A   13 10 8 11 8
Formula B   13 11 14 14
Formula C   4 1 3 4 2 4

Example 5:
The opinion of a random sample of a few employees on their pension plan and job classification was obtained.
How do we decide whether pension plan and job classification are related/associated?
Recapitulation
Population of Wages (per day) of employees of an organization Sample 1
3000 2486 820 1678 2070 2638 2490 1865 1000 2090 596 3200
1861 2495 1000 2497 1865 791 2090 2637 1327 1678
1680 2858 795 2495 2496 2501 1160 1480 1860 2490 Sample 2
2090 2840 2490 2640 659 827 2646 2638 2643 868 2840 2858 3000 2490 2998 3050 2070 2896 3200 2490 3280
1327 1866 1861 2486 2865 3011 2494 1489 1865 2855
Sample 3
2840 2499 2093 2660 1165 2600 2085 2640 2998 1861 2858 3240 2497 2865 656 2093 934 1861 868 795
2956 2495 2865 1865 3000 3019 1670 2858 2642 1680
Sample 4
3038 3000 1313 596 656 3240 590 2501 2485 3015
2086 1000 2497 596 656 875 2085 934 1313
2092 1679 3024 2497 2825 2630 2070 2900 1861 2636
2495 2637 2497 1159 2640 3050 870 2896 2500 2638 Sample 5
820 1313 3000 2640 596 2640 2600 2495 934 2500
926 2860 1481 875 2482 1860 2086 934 3200 2490
[Figure labels: True (not observable) value; Unbiased; Sampling error; Real difference; Sampling variation; Manager]
Suppose, based on the evidence, we are interested in the proportion of false-positive judgements of two judges.
Formulate the hypotheses.

The proportion of false positive judgement by Judge […] judgement by Judge 2

Note: H1 can also be stated as one-tailed:
The proportion of false-positive judgements by Judge 1 may be lower than the proportion of false-positive judgements by Judge 2.

A test of hypothesis is a statistical rule which decides whether to accept the null hypothesis or not.
Warning
parameters.
Decision             H0 true            H0 false
Do not reject H0     Correct decision   Type II Error
Reject H0            Type I Error       Correct decision (Power)
State null and alternative hypotheses
Conclusion
Define the critical region/ rejection criteria/ P - value
Parametric tests: Z-test for testing (µ1 - µ2)
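As an illustration of the Z-test for (µ1 - µ2), the brand data from the first example (Brand A: n = 30, mean 38600 km, SD 5000 km; Brand B: n = 40, mean 35450 km, SD 6100 km) can be checked with a short Python sketch. It assumes that both samples are large enough for the sample SDs to stand in for the population SDs and that the alternative is two-tailed; the function name `two_sample_z` is only illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Return (z, two-tailed p-value) for H0: mu1 - mu2 = 0.

    Assumption: sd1 and sd2 are treated as known population SDs,
    which is only reasonable for large samples.
    """
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (mean1 - mean2) / standard_error
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed


# Brand A versus Brand B (figures taken from the slide)
z, p = two_sample_z(38600, 5000, 30, 35450, 6100, 40)
print(f"z = {z:.3f}, two-tailed p = {p:.4f}")
```

If the P-value is below the chosen level of significance, H0 (no difference in mean kms) is rejected.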
Summary of One- and Two-Tail Tests
One-Tail Test One-Tail Test Two-Tail Test
right tail)
may be same
may be different
Reject H0
Z = 2.55
t-test
observations
Suppose
Define the critical region/ rejection criteria
at the beginning of the month and found that the mean weight was 5.25
kg. A randomly selected 10 packets at the end of the month had a mean
Quantity                      Value   Excel code
Degrees of freedom            23
t-statistic (t (observed))    2.233
P - value (1-tailed)          0.018   T.DIST.RT(t, df)
Critical value (1-tailed)     1.714   T.INV( , df)

t = 2.069
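The 1-tailed P-value and critical value quoted above (t = 2.233, df = 23) can be cross-checked outside Excel. The sketch below is one possible way to do it using only the Python standard library: it integrates the t density numerically with Simpson's rule and finds the critical value by bisection. The helper names `t_pdf`, `t_right_tail` and `t_critical` are hypothetical, and a 5% level of significance is assumed for the critical value.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_right_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_critical(alpha, df):
    """Value c with P(T > c) = alpha, found by bisection on [0, 50]."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_right_tail(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


p_value = t_right_tail(2.233, 23)   # counterpart of T.DIST.RT(t, df)
critical = t_critical(0.05, 23)     # 1-tailed critical value, alpha = 0.05 assumed
print(f"P-value (1-tailed) = {p_value:.3f}, critical value = {critical:.3f}")
```

Since the observed t exceeds the 1-tailed critical value (equivalently, the P-value is below 0.05), H0 would be rejected at the 5% level.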
observations
t-test
a single population (µd)
Assumptions
P-value is
Finding P - Value and Critical value Excel code
df = 1, Critical value at α = 0.05 is 3.841, P = 0.007
Inference: There may be an association between timings of accident and its outcome

Yes     21   36   30    87
No      48   26   19    93
Total   69   62   49   180
H0: Smoking habit and hypertension may be independent (may not be associated)
H1: Smoking habit and hypertension may not be independent (may be associated)

E1 = (r1 x c1)/n = (87 x 69)/180 = 33.35
E2 = (r1 x c2)/n = (87 x 62)/180 = 29.97
E3 = (r1 x c3)/n = (87 x 49)/180 = 23.68
E4 = (r2 x c1)/n = (93 x 69)/180 = 35.65
E5 = (r2 x c2)/n = (93 x 62)/180 = 32.03
E6 = (r2 x c3)/n = (93 x 49)/180 = 25.32
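The expected frequencies E = (row total x column total)/n can be reproduced with a few lines of Python. This is a minimal sketch using the observed 2 x 3 table of hypertension (Yes/No) counts from the slide; the variable names are illustrative only.

```python
# Observed counts: rows are hypertension Yes / No, columns are the three smoking groups
observed = [
    [21, 36, 30],
    [48, 26, 19],
]

row_totals = [sum(row) for row in observed]          # [87, 93]
col_totals = [sum(col) for col in zip(*observed)]    # [69, 62, 49]
n = sum(row_totals)                                  # 180

# Expected count under independence: (row total x column total) / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(e, 2) for e in row])
```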
Chi-square test
Calculation of Chi-square statistic

Sl No   Oi    Ei      (Oi - Ei)   (Oi - Ei)^2   (Oi - Ei)^2/Ei
1       21    33.35   -12.35      152.52        4.57
2       36    29.97     6.03       36.36        1.21
3       30    23.68     6.32       39.94        1.69
4       48    35.65    12.35      152.52        4.28
5       26    32.03    -6.03       36.36        1.14
6       19    25.32    -6.32       39.94        1.58
Total   180   180

Chi-square statistic = 14.46

Chi-square test
Interpretation

H0: Smoking habit and hypertension may be independent (may not be associated)
H1: Smoking habit and hypertension may not be independent (may be associated)

χ² = 14.46
df = 2, Critical value at α = 0.05 is 5.99, P < 0.001
Inference: There may be an association between smoking and hypertension
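The chi-square statistic and its P-value for the smoking and hypertension table can be verified with the following sketch. It recomputes the expected counts from the marginal totals and sums (Oi - Ei)²/Ei. The P-value uses the closed form exp(-χ²/2), which holds only because df = 2 here; for other degrees of freedom a statistical library or table would be needed.

```python
from math import exp

# Observed counts in the order used in the calculation table (Sl No 1 to 6)
observed = [21, 36, 30, 48, 26, 19]
row_totals = [87, 93]
col_totals = [69, 62, 49]
n = 180

# Expected counts under independence, in the same order as `observed`
expected = [r * c / n for r in row_totals for c in col_totals]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = (len(row_totals) - 1) * (len(col_totals) - 1)   # (2 - 1) x (3 - 1) = 2

# Right-tail probability of chi-square; this closed form is valid for df = 2 only
p_value = exp(-chi_sq / 2)

print(f"chi-square = {chi_sq:.2f}, df = {df}, P = {p_value:.5f}")
```

Because the statistic exceeds the critical value 5.99, H0 (independence) is rejected at the 5% level.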
H0: The digits occur uniformly frequently in the directory
H1: The digits do not occur uniformly frequently in the directory

Since χ² = 58.542 > 16.919 (critical value, df = 9), it can be inferred that the digits are not uniformly distributed.
Chi-square test for Goodness-of-fit

A consultant was employed by a city council to study the pattern of bus arrival and departure at a very busy interstate bus terminus. She collected data from the arrival of 200 buses. Based on the data, the average arrival rate was found to be λ = 2.96. She divided the arrivals into 6 categories. Assuming that the arrivals follow a Poisson distribution, test whether the arrival distribution follows the Poisson law. Use α = 0.01.

Chi-square test for Goodness-of-fit

The probabilities are to be calculated using the Poisson distribution with λ = 2.96 and x = 0, 1, 2, …; the results are as follows:

No. of arrivals   Oi   pi       Ei      (Oi - Ei)^2/Ei
0                 10   0.0524   10.48   0.0220
1                 23   0.1545   30.90   2.0197
2                 45   0.2277   45.54   0.0064
3                 49   0.2238   44.76   0.4016
4                 32   0.1651   33.02   0.0315
5 or more         41   0.1765   35.30   0.9204

χ² = 3.402

Since χ² = 3.402 < 13.27 (critical value at df = k-2 = 4), it can be inferred that the arrivals may follow the Poisson law.
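The goodness-of-fit calculation for the bus arrivals can be sketched in Python as follows. The probabilities pi are taken as printed on the slide rather than recomputed. One assumption is made about the data: the observed count for 1 arrival is taken as 23, so that the counts add up to the stated 200 buses and reproduce the χ² = 3.402 quoted in the conclusion. The P-value uses a closed form that is valid only for df = 4.

```python
from math import exp

# Observed counts for 0, 1, 2, 3, 4 and "5 or more" arrivals.
# Assumption: the count for 1 arrival is 23, so that the total is 200 buses.
observed = [10, 23, 45, 49, 32, 41]

# Poisson probabilities as printed on the slide (not recomputed here)
probs = [0.0524, 0.1545, 0.2277, 0.2238, 0.1651, 0.1765]

n = sum(observed)
expected = [n * p for p in probs]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# k categories, minus 1, minus 1 more for the estimated Poisson mean
df = len(observed) - 2

# Right-tail probability of chi-square; this closed form is valid for df = 4 only
half = chi_sq / 2
p_value = exp(-half) * (1 + half)

print(f"chi-square = {chi_sq:.3f}, df = {df}, P = {p_value:.3f}")
```

The statistic is well below the critical value 13.27 at α = 0.01, so H0 (Poisson arrivals) is not rejected.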
Exercises 1
Aircrew escape systems are powered by a solid propellant. The burning rate of this propellant is an important product characteristic. Specifications require that the mean burning rate must be 50 cm/s. From past experience it is known that the population SD is 2 cm/s. A sample of 25 solid propellants was selected randomly to reconfirm the specification stated. The sample mean found was 51.3 cm/s. At 5% level of significance what conclusion should be drawn?

Exercises 2
A builder claims that solar water heaters are installed in 70% of all homes being constructed today in a city. Would you agree with this claim if a random sample of new homes in this city shows that 28 out of 55 had solar water heaters installed? How are the P-value and the confidence interval related in this situation?
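For Exercise 1, a one-sample Z-test applies because the population SD is known. The sketch below is one way to check a hand calculation. It assumes a two-tailed alternative (H1: µ ≠ 50), since the specification states that the mean must be 50 cm/s; a different choice of alternative would change the P-value and the critical value.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 50.0      # specified mean burning rate (cm/s)
sigma = 2.0     # known population SD (cm/s)
n = 25          # sample size
x_bar = 51.3    # sample mean (cm/s)
alpha = 0.05    # level of significance

# Test statistic: Z = (x_bar - mu0) / (sigma / sqrt(n))
z = (x_bar - mu0) / (sigma / sqrt(n))

# Two-tailed critical value and P-value (two-tailed alternative is an assumption)
z_critical = NormalDist().inv_cdf(1 - alpha / 2)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

reject_h0 = abs(z) > z_critical
print(f"Z = {z:.2f}, critical value = {z_critical:.2f}, "
      f"P = {p_value:.4f}, reject H0: {reject_h0}")
```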
Exercises 3
A cigarette manufacturing company claims that its brand A cigarettes outsell its brand B cigarettes by 8%. If it is found that 42 out of a random sample of 200 smokers prefer brand A and 18 out of 100 smokers prefer brand B, test at 5% level of significance whether the 8% difference is a valid claim. Also construct a 95% CI for (P1-P2) and find the P-value.

Exercises 4
The management of a local health club claims that its members lose on the average 7 kgs or more within 3 months after joining the club. To check this claim, a consumer agency took a random sample of 15 members of this health club and found that they lost an average of 6.26 kgs within the first three months of membership. The sample standard deviation was 1.91 kgs. Test at 1% level of significance whether the claim made by the management of the local health club is acceptable or not. Also find the P-value of this test.
Exercises 5
A taxi company manager is trying to decide whether the use of radial tires instead of regular belted tires improves fuel economy. 12 cars were equipped with radial tires and driven over a […] regular belted tires and driven once again over the test course. The gasoline consumption in kilometers per liter was recorded. Assume that the populations are normally distributed
Belted tyres 4.3 3.9 5.2 5.9 5.8 4.4 4.7 5.8 5.9 3.7 4.9 4.9
[…] -test?
(ii) Based on the data can we conclude that cars equipped with radial tires give better fuel […]

Exercises 6
Three different analytical tests can be used to determine the […] procedures and the results are shown in the following tabulation. Is there sufficient evidence to conclude that the three […]
Exercises 7 Exercises 8
Before
(a) Construct 95% CI for difference in means Weight 187 195 221 190 175 197 199 221 278 285
After
Exercises 9 Exercises 10
The opinion of a random sample of 500 employees are shown Examine whether there is any
below association between job
classification and pension plan
Exercises 11 Exercises 11 (contd)
Exercises 12
days