Advanced_Stats_Final_Exam_Sample
Q1. The joint probability mass function of two discrete random variables X and Y is given as follows:
𝑃(𝑋 = 𝑥, 𝑌 = 𝑦) = (𝑥 + 𝑦)/36, 𝑥, 𝑦 ∈ {1, 2, 3}
Are the variables X and Y independent?
a. Yes, because 𝑃 (𝑋 = 1, 𝑌 = 1) = 𝑃 (𝑋 = 1) ∙ 𝑃 (𝑌 = 1)
b. No, because joint ≠ product of marginals
c. Cannot be determined without sample size
d. Only independent when x = y
e. Dependent, since 𝑃 (𝑋 = 2, 𝑌 = 2) > 𝑃 (𝑋 = 2) ∙ 𝑃 (𝑌 = 2)
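The independence check can be carried out exactly with rational arithmetic. The sketch below assumes the joint PMF is (x + y)/36 on {1, 2, 3} × {1, 2, 3}, which sums to 1 over the nine pairs; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

values = (1, 2, 3)

# Assumed joint PMF: P(X = x, Y = y) = (x + y) / 36 on {1, 2, 3} x {1, 2, 3}.
joint = {(x, y): Fraction(x + y, 36) for x, y in product(values, repeat=2)}
total = sum(joint.values())

# Marginals: sum the joint PMF over the other variable.
p_x = {x: sum(joint[(x, y)] for y in values) for x in values}
p_y = {y: sum(joint[(x, y)] for x in values) for y in values}

# Independence requires joint == product of marginals for every pair (x, y).
independent = all(
    joint[(x, y)] == p_x[x] * p_y[y] for x, y in product(values, repeat=2)
)

print("total probability:", total)
print("P(1,1) =", joint[(1, 1)], "vs P(X=1)P(Y=1) =", p_x[1] * p_y[1])
print("P(2,2) =", joint[(2, 2)], "vs P(X=2)P(Y=2) =", p_x[2] * p_y[2])
print("independent:", independent)
```

Under this assumption the pair (1, 1) already breaks the product rule (1/18 against 1/16), while the pair (2, 2) happens to satisfy it (1/9 on both sides), so a single matching cell is not enough to establish independence.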
Q2. A sample of 300 car-owning employees from various technology companies was asked
whether they own a luxury car. The responses obtained were tabulated as given in the
table.
Consider the level of significance (𝛼) = 0.05 for this problem. The p-value for this problem is 0.398.
What is the value of the test statistic for this dataset? And, what would be the degrees of freedom
(dof) for this dataset?
a. 𝜒² = 3.95, 𝑑𝑜𝑓 = 8
b. 𝜒² = 6.23, 𝑑𝑜𝑓 = 6
c. 𝜒² = 10.27, 𝑑𝑜𝑓 = 4
d. 𝜒² = 6.23, 𝑑𝑜𝑓 = 4
e. 𝜒² = 14.26, 𝑑𝑜𝑓 = 8
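Since the contingency table itself is not reproduced here, one way to test the options is to ask which (statistic, dof) pair gives the stated p-value of 0.398. The sketch below uses the closed-form upper tail of the chi-square distribution, which holds only for even degrees of freedom (all five options qualify); the function name is a hypothetical helper, not a library call.

```python
import math

def chi2_sf_even_dof(x, dof):
    """Upper-tail probability P(chi2 > x) for an even number of degrees of freedom."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this closed form only holds for positive even dof")
    half = x / 2.0
    # For dof = 2m: P(chi2 > x) = exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!
    return math.exp(-half) * sum(
        half**i / math.factorial(i) for i in range(dof // 2)
    )

# (test statistic, degrees of freedom) for each answer option.
options = {
    "a": (3.95, 8),
    "b": (6.23, 6),
    "c": (10.27, 4),
    "d": (6.23, 4),
    "e": (14.26, 8),
}
p_values = {label: chi2_sf_even_dof(stat, dof) for label, (stat, dof) in options.items()}

for label, p in p_values.items():
    print(label, round(p, 3))
```

Only the pair in option b reproduces a p-value of about 0.398; the other four give clearly different tail probabilities.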
Q3. A marketing team claims that customer preferences for four product colours are equally
distributed. To test this, they survey 200 customers and get the following responses:
What is the chi-square test statistic for this data, and which test does it correspond to?
Q4. A logistics firm claims the median delivery time for its premium service is 24 hours. A quality
analyst tests this by recording delivery times (in hours) for 10 randomly selected premium orders: 22,
26, 24, 30, 28, 23, 25, 27, 29, 31
Using the Wilcoxon signed-rank test at a 5% significance level, the analyst calculates the test statistic.
The critical value for n=10 (two-tailed) is 8. Calculate the sum of positive ranks (𝑊⁺), and determine
the correct conclusion.
Q5. A company tests the effectiveness of a new training program by comparing employee
productivity scores before and after the training. The scores (out of 100) for 6 employees are:
Employee   1   2   3   4   5   6
Before    72  65  80  68  75  82
After     78  70  82  65  77  85
Using the Wilcoxon signed-rank test (two-tailed, α = 0.05), the critical value for n = 6 is 2. What is
the sum of positive ranks (𝑊⁺) and the correct conclusion?
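The rank sums for both Wilcoxon questions (the delivery times tested against a median of 24, and the before/after training scores) can be checked with a short script. This is a minimal sketch under two common conventions that the questions do not spell out: zero differences are dropped, and tied absolute differences share their average rank. The helper name is hypothetical.

```python
def signed_rank_sums(differences):
    """Return (W_plus, W_minus); zeros are dropped and ties get average ranks."""
    nonzero = [d for d in differences if d != 0]
    ordered = sorted(nonzero, key=abs)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of equal absolute differences.
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        # 1-based positions i+1 .. j share their average rank.
        ranks[abs(ordered[i])] = (i + 1 + j) / 2
        i = j
    w_plus = sum(ranks[abs(d)] for d in nonzero if d > 0)
    w_minus = sum(ranks[abs(d)] for d in nonzero if d < 0)
    return w_plus, w_minus

# Delivery times, differences taken from the claimed median of 24 hours.
times = [22, 26, 24, 30, 28, 23, 25, 27, 29, 31]
q4_plus, q4_minus = signed_rank_sums([t - 24 for t in times])

# Training scores, differences taken as after minus before.
before = [72, 65, 80, 68, 75, 82]
after = [78, 70, 82, 65, 77, 85]
q5_plus, q5_minus = signed_rank_sums([a - b for a, b in zip(after, before)])

print("Delivery times: W+ =", q4_plus, " W- =", q4_minus)
print("Training scores: W+ =", q5_plus, " W- =", q5_minus)
```

With these conventions the test statistic is the smaller of the two sums, and the null hypothesis is rejected when it does not exceed the quoted critical value. Note that one delivery time equals 24 exactly, so only nine non-zero differences are ranked there; a different treatment of zeros or ties would change the sums.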
Q6. The monthly salaries of 10 programmers and the lines of code they write per day are
given in a table. How many lines of code per day are predicted for a programmer whose monthly salary is
$48,000? Slope (𝛽₁) = 4.5, and intercept (𝛽₀) = -30.
Monthly salary (in thousands of dollars)  32.7 34.7 35.6 37.3 37.5 38.5 39.5 40.5 41.2 42.5
Lines of code written per day             110  111  114  145  156  160  162  162  169  171
a. 150
b. 165
c. 175
d. 180
e. 186
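The prediction only needs the given intercept and slope, with the salary expressed in the same units as the table (thousands of dollars, so $48,000 becomes 48). A minimal sketch, with a hypothetical function name:

```python
def predict_lines_of_code(salary_thousands, intercept=-30.0, slope=4.5):
    """Fitted value y = intercept + slope * x, with x in thousands of dollars."""
    return intercept + slope * salary_thousands

prediction = predict_lines_of_code(48.0)
print(prediction)  # 186.0
```

A salary of 48 lies above the largest salary in the table (42.5), so this is an extrapolation beyond the observed range.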
Q7. A doctor wants to predict if a patient has a certain disease (1 = Yes, 0 = No) based on their age. A
logistic regression model gives this equation:
a. 12%
b. 25%
c. 50%
d. 75%
e. 90%
Q8. A hospital wants to predict whether a patient is at high risk of readmission (1 = Yes, 0 = No)
based on: length of stay (in days), and number of chronic conditions. The logistic model is:
Intercept (𝛽₀) = -3.2, Length of stay (𝛽₁) = 0.15 per day, Chronic conditions (𝛽₂) = 0.8 per condition.
For a patient who stayed for 6 days and has 2 chronic conditions, calculate (1) the log-odds of
readmission and (2) the probability of readmission.
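Both parts follow directly from the stated coefficients: the log-odds is the linear predictor, and the probability is its logistic transform. A minimal sketch, with illustrative variable names:

```python
import math

# Coefficients as stated in the question.
intercept, b_stay, b_chronic = -3.2, 0.15, 0.8
stay_days, chronic_conditions = 6, 2

# (1) Log-odds: the linear predictor of the logistic model.
log_odds = intercept + b_stay * stay_days + b_chronic * chronic_conditions

# (2) Probability: logistic (sigmoid) transform of the log-odds.
probability = 1.0 / (1.0 + math.exp(-log_odds))

print(round(log_odds, 2), round(probability, 3))
```

The log-odds is -3.2 + 0.9 + 1.6 = -0.7, which corresponds to a readmission probability of roughly 0.33.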
Q9. A medical AI model predicts whether a patient has a disease (1 = Positive, 0 = Negative). The
confusion matrix for 200 test cases is:
a. 0.67
b. 0.77
c. 0.80
d. 0.83
e. 0.90
Q10. A company analyses its quarterly sales data (in $000s) for the past 3 years:
a. 1.4
b. 1.28
c. 1.24
d. 1.78
e. 2.22
Q11. A store uses exponential smoothing with α = 0.4 to forecast daily sales.
a. 100
b. 104
c. 108
d. 112
e. 120
Q12. A coffee shop recorded its daily customer counts for the past 5 days:
Using the simple average method, what is the forecast for Saturday?
a. 120
b. 130
c. 140
d. 150
e. 160
Q13. A retailer uses a weighted average method to forecast weekly sales, assigning these weights to
the past 3 weeks:
Sales data:
• Week t-3: 200 units
a. 210
b. 231
c. 233
d. 235
e. 240
Q14. A farmer must choose between three crop options with profits ($000s) dependent on rainfall:
If the farmer uses the Maximax criterion, which crop will they choose?
a. Corn (A)
b. Soybean (B)
c. Wheat (C)
d. Indifferent between A and C
e. Cannot determine/ not enough data
Q15. Using the same crop profit table above, assume the farmer’s optimism coefficient (α) is 0.7.
a. Corn (A)
b. Soybean (B)
c. Wheat (C)
d. Indifferent between A and B
e. Cannot determine/ not enough data
Q16. A tech startup must decide whether to develop a new product (Cost: 200K) or upgrade an existing
product (Cost: 80K). Market research suggests:
Using a decision tree, which option has the higher expected monetary value (EMV)?
Q17. In logistic regression, the logit function is defined as log(𝑝 / (1 − 𝑝)). What is the primary purpose
of this transformation?
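The behaviour of the logit, log(p / (1 − p)), is easy to inspect numerically: it maps probabilities in (0, 1) onto the whole real line, and the logistic (sigmoid) function maps them back. The sketch below is illustrative only; the function names are not taken from the exam.

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def sigmoid(z):
    """Inverse of the logit: maps any real z back into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

for p in (0.01, 0.25, 0.5, 0.75, 0.99):
    z = logit(p)
    print(p, round(z, 3), round(sigmoid(z), 3))
```

Probabilities below 0.5 give negative log-odds, 0.5 gives exactly 0, and probabilities near 0 or 1 give values of large magnitude, which is why a linear predictor can be modelled on the logit scale without range restrictions.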
Q18. In a simple linear regression model Y = 𝛽₀ + 𝛽₁𝑋, what does the slope coefficient
(𝛽₁) represent?
Q20. The output of a logistic regression model for a binary classification problem is:
𝑃(𝑌 = 1) = 1 / (1 + 𝑒^(−(𝛽₀ + 0.5𝑋)))
How would you interpret the coefficient of X (0.5)?
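In a logistic model, a coefficient acts multiplicatively on the odds: a one-unit increase in X multiplies the odds of Y = 1 by e raised to the coefficient, whatever the intercept. The sketch below checks this for the coefficient 0.5; the intercept values are hypothetical and chosen only to show that the ratio does not depend on them.

```python
import math

def odds(x, intercept, slope=0.5):
    """Odds P(Y=1) / P(Y=0) under a logistic model with the given coefficients."""
    p = 1.0 / (1.0 + math.exp(-(intercept + slope * x)))
    return p / (1.0 - p)

# Odds ratio implied by a coefficient of 0.5 for a one-unit increase in X.
odds_ratio = math.exp(0.5)

# Hypothetical intercepts and starting points: the ratio should be the same each time.
ratios = [
    odds(x + 1.0, b0) / odds(x, b0)
    for b0 in (-2.0, 0.0, 1.0)
    for x in (0.0, 3.0)
]

print(round(odds_ratio, 4))
print([round(r, 4) for r in ratios])
```

The ratio is about 1.65 in every case, so each additional unit of X raises the odds by roughly 65%; the change in probability, by contrast, depends on where on the curve the observation sits.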
Q21. In multiple linear regression with two predictors (X1 and X2), how is the adjusted R-squared
different from R-squared?
Q22. What is the key distinction between parametric and non-parametric tests?
Q23. Which of the following is a non-parametric alternative to the paired t-test?
a. ANOVA
b. Wilcoxon signed-rank test
c. Chi-square test
d. Pearson’s correlation
e. Two-sample z-test
a. Mann-Whitney U test
b. Kruskal-Wallis test
c. Spearman’s rank correlation
d. Independent samples t-test
e. Friedman test
Q25. In the Wilcoxon signed-rank test, how are tied differences handled?
Q27. Which forecasting method would you choose for a dataset with no trend or seasonality but
occasional outliers?
1. B
2. B
3. B
4. A
5. A
6. E
7. C
8. A
9. B
10. B
11. C
12. C
13. B
14. A
15. A
16. A
17. C
18. B
19. C
20. B
21. B
22. A
23. B
24. D
25. B
26. E
27. E
28. B
29. B
30. B