A2 Mock
1. Which of the following statements about the central limit theorem is most appropriate?
a. As sample size increases, the distribution of a random variable approaches a normal
distribution, regardless of the underlying distribution of the variable.
b. Provided that 𝑛𝜋 and 𝑛(1 − 𝜋) are each at least five, the sampling distribution of
the proportion may be approximated with a normal distribution.
c. The sampling distribution of the average will be approximately normally distributed
provided that the sample size is 30 or more.
d. Good estimators of a parameter value are unbiased, consistent and efficient.
2. Identify the most inappropriate statement about the finite population correction (FPC) from
the following options.
a. It should be considered when sampling from a finite population without replacement.
b. The impact of the FPC factor is to adjust the standard error downwards (that is,
decrease the standard error).
c. It generally becomes important to accommodate the FPC factor when the sample size
is at least 5% of a finite population and sampling is done without replacement.
d. None of the above.
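As a study aid, the effect of the FPC factor on the standard error of the mean can be sketched as follows. This is a minimal illustration, assuming the usual factor sqrt((N − n) / (N − 1)); the function names and the numbers (sigma = 12, n = 50, N = 500) are hypothetical and not taken from the question.

```python
import math
from typing import Optional

def fpc_factor(population_size: int, sample_size: int) -> float:
    """Finite population correction factor: sqrt((N - n) / (N - 1))."""
    return math.sqrt((population_size - sample_size) / (population_size - 1))

def standard_error_mean(sigma: float, sample_size: int,
                        population_size: Optional[int] = None) -> float:
    """Standard error of the mean, with the FPC applied when N is given."""
    se = sigma / math.sqrt(sample_size)
    if population_size is not None:
        se *= fpc_factor(population_size, sample_size)
    return se

# Hypothetical values: the sample is 10% of the population.
se_infinite = standard_error_mean(12.0, 50)
se_finite = standard_error_mean(12.0, 50, population_size=500)
print(se_infinite, se_finite)
```

Because the factor is below one whenever n > 1, the corrected standard error is the smaller of the two.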
4. You estimated that P(𝑋̅ > 𝑎) = 0.4825. Assume that 𝑋̅ ~ Normal(𝜇, 𝜎²⁄𝑛). Select the
correct statement from the following.
a. P(𝑋̅ < 𝑎) = −0.4825
b. P(𝑍 < (𝑥̅ − 𝜇) ⁄ √(𝜎²⁄𝑛)) = 0.4825
c. 𝑎 < 𝜇
d. 𝑍 = (𝑥̅ − 𝜇) ⁄ √(𝜎²⁄𝑛) > 0
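One way to check the options in Question 4 is to convert the stated upper-tail probability into a standardised position. The sketch below uses Python's standard library and only the value 0.4825 from the question; the variable names are illustrative.

```python
from statistics import NormalDist

upper_tail = 0.4825            # P(X-bar > a), as given in the question
lower_tail = 1 - upper_tail    # P(X-bar < a), by the complement rule

# Standardised position of a: z_a = (a - mu) / sqrt(sigma^2 / n)
z_a = NormalDist().inv_cdf(lower_tail)
print(lower_tail, z_a)
```

A lower-tail probability above 0.5 places the standardised value of 𝑎 above zero, which locates 𝑎 relative to 𝜇.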
9. All the statements given below are true except one. Identify the odd statement.
a. The sample size required to ensure that a 90% confidence interval is estimated to
within a margin of error of 17 units should be less than the size required for an 80%
confidence interval, provided that all other quantities remain constant.
b. Setting 𝑝 = 0.50 is a conservative choice to adopt when no prior information about
the proportion of interest is available.
c. The sample size required for a specific margin of error and a set confidence interval is
less when the population is finite compared to when the population is infinite.
d. Interval of error = 2 × Margin of error = (Upper − Lower) confidence bound.
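The relationship between confidence level and required sample size can be explored numerically. The sketch below assumes the standard formula for estimating a mean, n = (z × sigma / e)², with a two-sided interval; the population standard deviation of 100 is a hypothetical value, since the question does not supply one.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma: float, margin_of_error: float,
                         confidence: float) -> int:
    """Smallest n with z * sigma / sqrt(n) <= margin_of_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin_of_error) ** 2)

SIGMA = 100.0   # hypothetical population standard deviation
MARGIN = 17.0   # margin of error quoted in the question

n_80 = sample_size_for_mean(SIGMA, MARGIN, 0.80)
n_90 = sample_size_for_mean(SIGMA, MARGIN, 0.90)
print(n_80, n_90)
```

With everything else fixed, a higher confidence level means a larger critical value and therefore a larger required sample.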
12. A hypothesis test was conducted and the conclusion was to fail to reject the null hypothesis.
Which of the following outputs is most likely to have been obtained from the test?
a. H0 : 𝜋 = 0.35; 𝑧𝑠𝑡𝑎𝑡 = −1.410; 𝑧𝑐𝑟𝑖𝑡 = ±1.645; Confidence Interval: [0.07, 0.36]
b. H0 : 𝜋 ≠ 0.35; 𝑧𝑐𝑟𝑖𝑡 = −1.410; 𝑧𝑠𝑡𝑎𝑡 = −1.645; Confidence Interval: [0.07, 0.36]
c. H0 : 𝜋 = 0.35; 𝑡𝑠𝑡𝑎𝑡 = −1.410; 𝑡𝑐𝑟𝑖𝑡 = ±1.645; Confidence Interval: [0.36, 0.70]
d. None of the options above is consistent with the conclusion from the test.
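The consistency checks behind Question 12 can be written down as two small helpers: a two-tail decision rule on the test statistic, and a check of whether the confidence interval covers the hypothesised value. This is a sketch with illustrative function names, applied to the figures quoted in option a.

```python
def two_tail_decision(test_stat: float, critical_value: float) -> str:
    """Reject H0 only when the statistic lies beyond either critical value."""
    if abs(test_stat) > abs(critical_value):
        return "reject H0"
    return "fail to reject H0"

def interval_contains(lower: float, upper: float, hypothesised: float) -> bool:
    """True when the hypothesised value lies inside the confidence interval."""
    return lower <= hypothesised <= upper

decision = two_tail_decision(-1.410, 1.645)
covered = interval_contains(0.07, 0.36, 0.35)
print(decision, covered)
```

For a two-tail test at a matching significance level, failing to reject the null hypothesis and the interval covering the hypothesised value should agree with each other.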
13. You were informed that the decision of a hypothesis test is that the null hypothesis was
rejected. What can you say about the tolerance for Type I error and the p-value for the test?
a. It must have been an upper-tail test and the p-value < 𝛼.
b. The test must have been a two-tail test and the confidence interval would not have
included the value specified in the null hypothesis.
c. Regardless of the nature of the test, it must have been that the p-value < 𝛼.
d. The probability of observing the test statistic or any value more extreme is less than
a pre-specified threshold.
14. All the following statements are accurate with respect to possible test outcomes and the
relationships between them, except one. Select the odd one.
a. Confidence calculation depends on the null hypothesis being true.
b. Type I error and Type II error are inversely related and are also mutually exclusive.
c. When the ability to reject the null hypothesis given that it is false increases, the power
of the test increases.
d. None of the above.
A study was recently commissioned to verify the alternative hypothesis that the pass rate for SDS188
is less than 68%. A random sample of 28 students registered for the course was collected to verify the
hypothesis. Questions 15 – 17 refer to this scenario.
15. Which of the following best describes a case where Type II error is being committed?
a. It was concluded that the pass rate for the course is not less than 68% when 65% is
the true pass rate.
b. The conclusion of the test was that the pass rate of SDS188 is more than 68% when
the true pass rate is 70%.
c. It was incorrectly claimed that the pass rate of the module is less than 68%.
d. Incorrectly applying the central limit theorem approximation when fewer than 30
students were sampled.
16. Identify which of the following statements describes the confidence of the test.
a. It was concluded that the sample data contained insufficient evidence for the null
hypothesis to be rejected. It was then claimed that SDS188 pass rate is not less than
68%.
b. It was correctly inferred that the pass rate for the course is not less than 68%.
c. The decision was to accept that the pass rate of SDS188 is 68% because the confidence
interval does not contain zero.
d. Bootstrapping was carefully applied to account for the size of the sample being less
than 30 students.
17. The decision from the hypothesis test was to fail to reject the null hypothesis. It was
subsequently reported that an estimate of the standardised effect size (SES) for the test is
available. Which of the following best characterises a possible implication of the SES estimate?
a. It is highly likely that, if 54 students were sampled and the SES estimate was 87.89%,
the conclusion will be that the pass rate for SDS188 is not less than 68%.
b. It is highly unlikely that, if 54 students were sampled and the SES estimate was
19.43%, there will be insufficient evidence in the sampled data to be able to claim that
the pass rate for SDS188 is not less than 68%.
c. Both statements a. and b. are correct.
d. None of statements a. and b. is correct.
The Department of Statistics and Actuarial Science at the university is interested in comparing the
performances of two teaching methods, A and B. Different approaches are being considered. The final
assessment score can be assumed to be normally distributed.
I. Randomly assign students to the two teaching methods and compare the average
grades from a final assessment written by all participants when the experiment ends.
II. Carefully expose all students to both teaching methods, in a way that minimises bias.
At the end of the experiment, the average of the differences in the scores obtained
after Method A and that obtained after Method B is then analysed.
III. Allow all students to participate in the two teaching methods in a way that bias is
minimised. Then ask students whether they prefer one method to the other.
IV. Assign students to the two methods randomly and compare the variability in the final
grades.
Questions 19 – 23 refer to this scenario.
19. Which of the following null hypothesis statements is most applicable to Approach II?
a. 𝜇𝐴 − 𝜇𝐵 = 0
b. 𝜋𝐴 − 𝜋𝐵 = 0
c. 𝜇𝐴−𝐵 = 0
d. 𝜎𝐴² ÷ 𝜎𝐵² = 1
20. What is the most appropriate statement that can be made about the following expression?
𝑠∗² = [(𝑛𝐴 − 1)𝑠𝐴² + (𝑛𝐵 − 1)𝑠𝐵²] ⁄ [(𝑛𝐴 − 1) + (𝑛𝐵 − 1)]
a. It is the pooled variance and it is applicable in Approach I and Approach II when the
population variances are unknown but can be assumed to be equal.
b. It is the pooled variance and it is only applicable in Approach I when the population
variances are unknown but are assumed to be equal.
c. It is an average variance expression and it is useful whenever the unequal variance t-
test is not applicable.
d. None of the above.
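The expression in Question 20 can be computed directly from two independent samples. The sketch below is a minimal illustration; the function name and the score lists are hypothetical.

```python
from statistics import variance

def pooled_variance(sample_a, sample_b):
    """Weighted average of two sample variances, weighted by degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    s2_a, s2_b = variance(sample_a), variance(sample_b)  # n - 1 denominators
    return ((n_a - 1) * s2_a + (n_b - 1) * s2_b) / ((n_a - 1) + (n_b - 1))

# Hypothetical final assessment scores for two independent groups.
scores_a = [62, 70, 68, 75]
scores_b = [58, 64, 71, 66, 69]
s2_pooled = pooled_variance(scores_a, scores_b)
print(s2_pooled)
```

The calculation takes two separate samples with their own sizes and variances, so the result always lies between the two sample variances.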
21. For which analysis of the data collected from the different scenarios is the t-test applicable?
Select the most appropriate option.
a. All the approaches, provided that population variance is unknown.
b. Approach I and Approach II only, if a standard deviation estimate had to be obtained
from the sample.
c. Approach I, Approach II and Approach IV only, when the population variance is
unknown but an estimate of the degrees of freedom can be obtained from the sample.
d. None of the above.
23. Which of the following provides an accurate expression for the coefficient of determination?
a. ∑(𝑦𝑖 − 𝑦̂𝑖)² ⁄ ∑(𝑦𝑖 − 𝑦̅)²
b. ∑(𝑦𝑖 − 𝑦̂𝑖) ⁄ ∑(𝑦𝑖 − 𝑦̅)
c. ∑(𝑦̂𝑖 − 𝑦̅) ⁄ ∑(𝑦𝑖 − 𝑦̅)
d. ∑(𝑦̂𝑖 − 𝑦̅)² ⁄ ∑(𝑦𝑖 − 𝑦̅)²
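The sums of squares that appear in these options can be computed for a small fitted line. The sketch below fits a simple least-squares regression by hand on hypothetical data and reports the regression, error and total sums of squares; all names and values are illustrative.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a simple linear regression."""
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

x = [1, 2, 3, 4, 5]          # hypothetical predictor values
y = [2, 4, 5, 4, 5]          # hypothetical responses
intercept, slope = fit_line(x, y)
y_hat = [intercept + slope * xi for xi in x]
y_bar = sum(y) / len(y)

ssr = sum((f - y_bar) ** 2 for f in y_hat)             # explained variation
sse = sum((yi - f) ** 2 for yi, f in zip(y, y_hat))    # unexplained variation
sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
r_squared = ssr / sst
print(r_squared, 1 - sse / sst)
```

For a least-squares fit with an intercept, SST = SSR + SSE, so the explained share SSR ⁄ SST equals 1 − SSE ⁄ SST.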
24. The following alternative hypothesis statement best corresponds to which of the following tests?
𝛽1 ≠ 0 or 𝛽2 ≠ 0 or 𝛽3 ≠ 0
a. F-test in a multiple linear regression that contains two independent (X) variables and
an intercept term.
b. t-test with respect to the slope coefficient in a simple linear regression setting.
c. F-test in a multiple linear regression with an intercept term and three X variables. A
possible combination of the X variables is: one continuous variable and two dummy
variables.
d. Z-test in a multiple linear regression and, if rejected, it implies that all the slope
coefficients in the multiple linear regression are insignificantly different from zero.
It is common among university students to boast about their residences being home to only the finest
brains. Therefore, the university set out to understand if there is any relationship between year-end
grade (measured in %), age (measured in years) and student residence. Three residences were
considered: residences U, V and W. The sample included 39 student records. The following regression
equations were obtained from the study. It can be assumed that student year-end grade is normally
distributed. Questions 25 – 29 are based on this scenario.
Predicted Grade = 46 + 3.1 × Age (Equation I)
Predicted Grade = 42 + 2.5 × Age + 0.98 × V + 1.33 × W (Equation II)
25. Which of the following represents an appropriate interpretation that may be deduced from
Equation (I)?
a. When age increases by a year, year-end grade increases by about 49.1, on average.
b. When year-end grade increases by 3.1%, age increases by a year.
c. A one-year increase in age causes year-end grade to increase by about 3.1%.
d. None of the above.
26. The following are attempts by different students to interpret one of the coefficients from
Equation (II). Which of the interpretations is/are accurate?
i. Given two students of the same age who stay in different residences, the student who
stays in Residence V is expected to have a year-end score of about 0.98% more,
compared to a student in Residence U.
ii. For a pair of students that stay in the same residence and differ only in age by one
year, the year-end score of the older student is expected to be about 2.5% higher.
iii. Compared to a student in Residence V, a student in Residence W is expected to have a
year-end score of about 0.50% more, if age is held constant.
iv. The average score for all students in all the residences considered is about 42%
regardless of the age.
a. Only i. and ii. are correct
b. Only i., ii. and iii. are correct.
c. Only ii., iii. and iv. are correct.
d. All the statements above are correct.
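The interpretations in Question 26 can be checked by evaluating Equation (II) for pairs of students. The sketch below assumes that V and W are 0/1 dummy variables with Residence U as the reference category (V = W = 0), which is the usual coding for three categories and two dummies; the function name and the age of 20 are illustrative.

```python
def predict_grade(age: float, residence: str) -> float:
    """Predicted year-end grade from Equation (II); U is the assumed baseline."""
    v = 1 if residence == "V" else 0
    w = 1 if residence == "W" else 0
    return 42 + 2.5 * age + 0.98 * v + 1.33 * w

AGE = 20  # hypothetical age, held constant across residences

diff_v_vs_u = predict_grade(AGE, "V") - predict_grade(AGE, "U")
diff_w_vs_v = predict_grade(AGE, "W") - predict_grade(AGE, "V")
diff_one_year = predict_grade(AGE + 1, "U") - predict_grade(AGE, "U")
print(diff_v_vs_u, diff_w_vs_v, diff_one_year)
```

Each difference isolates one comparison while the other variables are held fixed, which is how the coefficients in a multiple regression are read.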
27. Assume that there was not enough evidence in the sampled data to be able to reject any of
the null hypotheses tested by the t-tests for the dummy variables. Which of the following
statements with respect to the coefficients of determination from the two regression
equations is true?
a. The 𝑟² and adjusted 𝑟² values for Equation (I) will be less than those of Equation (II).
b. The 𝑟² for Equation (II) will be greater than that of Equation (I) but the adjusted 𝑟² for
Equation (I) will be greater.
c. The adjusted 𝑟² for Equation (II) will be greater than that of Equation (I) but the 𝑟² for
Equation (I) will be less.
d. The 𝑟² and adjusted 𝑟² values for Equation (II) will be less than those of Equation (I).
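The penalty that the adjusted coefficient of determination applies for extra predictors can be illustrated numerically. The sketch below assumes the usual formula, adjusted r² = 1 − (1 − r²)(n − 1) ⁄ (n − k − 1); the two r² values are hypothetical, chosen only to mimic weak dummy variables that add little explanatory power. Only n = 39 comes from the scenario.

```python
def adjusted_r_squared(r_squared: float, n: int, k: int) -> float:
    """Adjusted r^2 for n observations and k predictors (excluding intercept)."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

N = 39                 # sample size from the scenario
r2_eq1 = 0.400         # hypothetical r^2 for Equation (I), one predictor
r2_eq2 = 0.405         # hypothetical r^2 for Equation (II), three predictors

adj_eq1 = adjusted_r_squared(r2_eq1, N, k=1)
adj_eq2 = adjusted_r_squared(r2_eq2, N, k=3)
print(adj_eq1, adj_eq2)
```

With these hypothetical values, r² increases slightly when the dummies are added, while adjusted r² decreases because the gain does not offset the lost degrees of freedom.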
28. Which of the following is the most appropriate degrees of freedom for the t-test with respect
to the null hypothesis: 𝛽𝑉 = 0?
a. 3
b. 35
c. 36
d. 38
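The residual degrees of freedom for a coefficient t-test in a regression can be computed from the sample size and the number of predictors. The sketch below assumes the standard rule df = n − k − 1, where k counts the X variables and the extra one accounts for the intercept; the function name is illustrative.

```python
def residual_df(n_observations: int, n_predictors: int) -> int:
    """Residual degrees of freedom: n - k - 1 (the 1 is for the intercept)."""
    return n_observations - n_predictors - 1

df_equation_1 = residual_df(39, 1)   # Equation (I): Age only
df_equation_2 = residual_df(39, 3)   # Equation (II): Age, V and W
print(df_equation_1, df_equation_2)
```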
SDS188 A2S2 Preparation II Solution
1. C 5. A 9. A 13. C 17. D 21. B 25. D