MH3511 Data Analysis With Computer: Lab 5 (Solution) AY2019/20 Semester 2

This document contains solutions to exercises from a data analysis lab involving hypothesis testing on proportions and means using normal approximations and t-tests. The exercises involve: 1) checking normality of wind speed data and testing the mean, 2) testing the proportion of cars with issues against a claimed value, and 3) determining type I error rates and power of a test with various sample sizes and alternative hypotheses. R code is provided to perform the relevant statistical tests and calculations.

MH3511 Data Analysis with Computer

Lab 5 (Solution) AY2019/20 Semester 2

Exercise 5.1
A person has been trained to set the bean grinder so that a 25-second espresso shot results in 2
ounces of espresso. He pours thirteen shots and measures the amounts to be

1.95, 1.78, 2.10, 1.82, 1.73, 2.01, 1.83, 1.90, 2.05, 1.85, 1.96, 1.98, 1.79

a) Examine whether the data are roughly normally distributed using QQ-plot and Shapiro-
Wilk’s test.
b) Use R code similar to the following to compare three versions of the two-sided 90%
confidence interval: one using the normal approximation, one using the t-distribution
approximation, and one from the t.test() function.

coffee <- c(1.95, 1.78, 2.10, ……)

n <- length(coffee)
xbar <- mean(coffee)
s <- sd(coffee)

alpha <- 0.10

z <- qnorm(1-alpha/2)
t <- qt(1-alpha/2, df=n-1)

zCI90p <- c(xbar-z*s/sqrt(n), xbar+z*s/sqrt(n))

print(paste("n=",n, "; xbar=", xbar, "; s=", s ))
print(paste("alpha=",alpha, "; z=", z, "; t=", t ))
print(paste("z90% CI= [", zCI90p[1], zCI90p[2], "]"))

t.test(coffee, conf.level=0.9)

c) Find the one-sided 90 % CI of the form ¿, using normal approximation, t-distribution
approximation, and using the t.test() function. Compare the three results.

>
>
> coffee<- c(1.95, 1.78, 2.10, 1.82, 1.73, 2.01, 1.83, 1.90, 2.05,
1.85, 1.96, 1.98, 1.79)
>
> qqnorm(coffee)
> qqline(coffee)
> shapiro.test(coffee)

Shapiro-Wilk normality test

data: coffee
W = 0.96491, p-value = 0.8271

> n <- length(coffee)


> xbar <- mean(coffee)
> s <- sd(coffee)
>
> alpha <- 0.10
> z<- qnorm(1-alpha/2)
> t <- qt(1-alpha/2, df=n-1)
>
> zCI90p<- c(xbar-z*s/sqrt(n), xbar+z*s/sqrt(n))
> tCI90p<- c(xbar-t*s/sqrt(n), xbar+t*s/sqrt(n))
>
> print(paste("n=",n, "; xbar=", xbar, "; s=", s ))
[1] "n= 13 ; xbar= 1.90384615384615 ; s= 0.114056890887725"
> print(paste("alpha=",alpha, "; z=", z, "; t=", t ))
[1] "alpha= 0.1 ; z= 1.64485362695147 ; t= 1.78228755564932"
> print(paste("z90% CI= [", zCI90p[1], zCI90p[2], "]"))
[1] "z90% CI= [ 1.85181336431625 1.95587894337605 ]"
> print(paste("t90% CI= [", tCI90p[1], tCI90p[2], "]"))
[1] "t90% CI= [ 1.84746582203709 1.96022648565522 ]"
>
> t.test(coffee, conf.level=0.9)

One Sample t-test

data: coffee
t = 60.184, df = 12, p-value = 2.928e-16
alternative hypothesis: true mean is not equal to 0
90 percent confidence interval:
1.847466 1.960226
sample estimates:
mean of x
1.903846
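
The transcript above covers parts (a) and (b). For part (c), a minimal sketch might look like the following. The intended form of the one-sided interval is not legible in the question, so both the upper-bounded and lower-bounded versions are computed here. The variable names (se, z1, t1, and so on) are illustrative and not taken from the original solution.

```r
# Espresso shot volumes from Exercise 5.1
coffee <- c(1.95, 1.78, 2.10, 1.82, 1.73, 2.01, 1.83, 1.90, 2.05,
            1.85, 1.96, 1.98, 1.79)

n    <- length(coffee)
xbar <- mean(coffee)
se   <- sd(coffee) / sqrt(n)   # estimated standard error of the mean

alpha <- 0.10
z1 <- qnorm(1 - alpha)         # one-sided: all of alpha goes in one tail
t1 <- qt(1 - alpha, df = n - 1)

# Upper-bounded intervals (-Inf, U]
z_upper <- xbar + z1 * se
t_upper <- xbar + t1 * se

# Lower-bounded intervals [L, Inf)
z_lower <- xbar - z1 * se
t_lower <- xbar - t1 * se

# t.test() returns the matching one-sided interval via `alternative`
tt_less    <- t.test(coffee, alternative = "less",    conf.level = 0.9)
tt_greater <- t.test(coffee, alternative = "greater", conf.level = 0.9)

print(c(z_upper = z_upper, t_upper = t_upper))
print(c(z_lower = z_lower, t_lower = t_lower))
print(tt_less$conf.int)
print(tt_greater$conf.int)
```

The t-based bounds computed by hand should agree with those from t.test(), while the normal-approximation bounds are slightly tighter because the normal quantile is smaller than the t quantile with 12 degrees of freedom.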
Exercise 5.2
The existing data frame “airquality” in R contains daily air quality measurements in New York, May to
September 1973, with 153 observations on 6 variables, where the variable “Wind” is the average wind
speed in mph (miles per hour).

I. Subset the data frame and use a QQ-plot to check if the wind speed data in August and
September are approximately normally distributed.
II. One wants to test the hypothesis that the mean wind speed is 10 mph during August and
September.
a. What are the null and alternative hypotheses?
b. Find the p-value of this test using three methods: normal approximation, t
approximation and the t.test() function.
c. Are the p-values obtained from the three methods the same or very similar? Why?

Let μ be the mean wind speed in August and September.

We want to test $H_0: \mu = 10$ against $H_1: \mu \neq 10$.

The test statistic is
$$\frac{\bar{x} - \mu}{s/\sqrt{n}},$$
evaluated at $\mu = 10$ under $H_0$, for both the normal and t approximations.
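
No R code for this exercise survives in the solution, so the following is only a sketch of how the three p-values might be computed. It assumes the built-in airquality data frame with its Month and Wind columns; the names wind, stat, p_z and p_t are illustrative.

```r
# Keep only the August (8) and September (9) wind speeds
wind <- subset(airquality, Month %in% c(8, 9))$Wind

# Part I: visual check of normality
qqnorm(wind)
qqline(wind)

# Part II: two-sided test of H0: mu = 10
mu0  <- 10
n    <- length(wind)
stat <- (mean(wind) - mu0) / (sd(wind) / sqrt(n))

p_z <- 2 * pnorm(-abs(stat))           # normal approximation
p_t <- 2 * pt(-abs(stat), df = n - 1)  # t approximation
res <- t.test(wind, mu = mu0)          # built-in one-sample t-test

print(c(statistic = stat, p_normal = p_z, p_t = p_t, p_t.test = res$p.value))
```

The hand-computed t p-value and the one from t.test() should coincide, since both use the same statistic and degrees of freedom. The normal p-value is slightly smaller, but with a sample this large the t distribution is already close to the standard normal, so all three are very similar.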
Exercise 5.3
Historically, a car from a given company has a 10% chance of having a significant mechanical
problem during its warranty period. A new model of the car is being sold. Of the first 25,000 sold,
2,700 have had an issue. Perform a test of significance to see whether the proportion of these new
cars that will have a problem is more than 10%. What is the p-value? (Use both normal
approximation and prop.test() function in R.)

We test $H_0: p = 0.1$ against $H_1: p > 0.1$.

We know that
$$\frac{\hat{P} - p}{\sqrt{p(1-p)/n}}$$
has an approximately $N(0,1)$ distribution.

Now, we observe $\hat{p} = 2700/25000 = 0.108$, so the p-value is
$$\Pr(\hat{P} > 0.108 \mid H_0) = \Pr\left(Z > \frac{0.108 - 0.1}{\sqrt{0.1(1-0.1)/25000}}\right) = \Pr(Z > 4.216).$$
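
The calculation can be checked in R along the following lines. This is a sketch rather than the original solution code; the variable names are illustrative. Note that prop.test() applies Yates' continuity correction by default, so it is run here both with and without the correction.

```r
x  <- 2700     # cars with an issue
n  <- 25000    # cars sold
p0 <- 0.1      # historical proportion under H0

# Normal approximation by hand
p_hat  <- x / n
z      <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_norm <- pnorm(z, lower.tail = FALSE)   # Pr(Z > z)

# prop.test() without continuity correction reproduces the hand calculation
pt_plain <- prop.test(x, n, p = p0, alternative = "greater", correct = FALSE)

# Default call, with Yates' continuity correction
pt_yates <- prop.test(x, n, p = p0, alternative = "greater")

print(c(z = z, p_normal = p_norm,
        p_prop_plain = pt_plain$p.value,
        p_prop_yates = pt_yates$p.value))
```

Without the correction, the p-value from prop.test() matches the normal approximation; with it, the p-value is marginally larger. Either way it is far below any conventional significance level.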

Exercise 5.4
Let $\{X_1, X_2, X_3, \ldots, X_n\}$ be a random sample of size $n$ from a normal population. We are interested
in testing $H_0: X \sim N(0, 1)$ against $H_1: X \sim N(\mu_1, 1)$, where $\mu_1 > 0$.

Suppose we apply the decision rule that we reject $H_0$ if $\bar{x} \geq 0.5$, and otherwise do not reject $H_0$.

a) For various values of n (10, 11, 20 and 30), use R code to determine the probability of a Type
I error. At what value of n would the probability of a Type I error be approximately 0.05?
b) For n = 11, determine the power of this test when $\mu_1$ = 1.0, 1.1, 1.2 and 1.3.

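Note that when $X \sim N(\mu, 1)$,
$$\frac{\bar{X} - \mu}{1/\sqrt{n}} \sim N(0, 1).$$

Under $H_0: X \sim N(0, 1)$, the probability of a Type I error is
$$\Pr(\bar{X} \geq 0.5) = \Pr\left(\frac{\bar{X} - 0}{1/\sqrt{n}} \geq \frac{0.5 - 0}{1/\sqrt{n}}\right) = \Pr(Z \geq 0.5\sqrt{n}).$$

When n = 11, under $H_1: X \sim N(\mu_1, 1)$, the power of the test is
$$1 - \Pr(\bar{X} < 0.5) = 1 - \Pr\left(\frac{\bar{X} - \mu_1}{1/\sqrt{11}} < \frac{0.5 - \mu_1}{1/\sqrt{11}}\right) = 1 - \Pr\left(Z < (0.5 - \mu_1)\sqrt{11}\right).$$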
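
A short R sketch of both calculations follows. The helper names type1_error and power_at are hypothetical and not part of the original solution; they evaluate the rejection probability Pr(Z ≥ 0.5√n) under the null and 1 − Pr(Z < (0.5 − μ1)√n) under the alternative.

```r
# Pr(reject H0 | H0 true) for the rule "reject if xbar >= cutoff"
type1_error <- function(n, cutoff = 0.5) {
  pnorm(cutoff * sqrt(n), lower.tail = FALSE)
}

# Pr(reject H0 | true mean is mu1) for the same rule
power_at <- function(mu1, n = 11, cutoff = 0.5) {
  1 - pnorm((cutoff - mu1) * sqrt(n))
}

# Part (a): Type I error for each sample size
n_values <- c(10, 11, 20, 30)
alpha_n  <- type1_error(n_values)
print(data.frame(n = n_values, type1_error = round(alpha_n, 4)))

# Part (b): power at n = 11 for each alternative mean
mu1_values <- c(1.0, 1.1, 1.2, 1.3)
power_n11  <- power_at(mu1_values, n = 11)
print(data.frame(mu1 = mu1_values, power = round(power_n11, 4)))
```

Among the four sample sizes, n = 11 gives a Type I error probability closest to 0.05 (about 0.049), and the power increases as μ1 moves further from 0.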
