
Advanced_Stats_Final_Exam_Sample

The document contains a series of questions related to statistical analysis, including concepts of probability, chi-square tests, logistic regression, and forecasting methods. Each question presents a scenario with multiple-choice answers, focusing on the application of statistical methods to real-world problems. The questions cover a range of topics, from independence of random variables to the effectiveness of training programs and decision-making in business contexts.

Uploaded by

Airav Jain

Q1.

The joint probability mass function of two discrete random variables X and Y is given as follows:

P(X = x, Y = y) = (x + y) / 36,   x, y ∈ {1, 2, 3}

Are the variables X and Y independent?

a. Yes, because 𝑃 (𝑋 = 1, 𝑌 = 1) = 𝑃 (𝑋 = 1) ∙ 𝑃 (𝑌 = 1)
b. No, because joint ≠ product of marginals
c. Cannot be determined without sample size
d. Only independent when x = y
e. Dependent, since 𝑃 (𝑋 = 2, 𝑌 = 2) > 𝑃 (𝑋 = 2) ∙ 𝑃 (𝑌 = 2)
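
One way to check independence in Q1 is to compute both marginals and compare every joint probability with the product of its marginals. The sketch below assumes the joint PMF reads P(X = x, Y = y) = (x + y) / 36; the variable names are illustrative and not part of the exam.

```python
from fractions import Fraction

values = [1, 2, 3]

# Assumption: joint PMF is (x + y) / 36 on {1, 2, 3} x {1, 2, 3}.
joint = {(x, y): Fraction(x + y, 36) for x in values for y in values}
total = sum(joint.values())  # should be 1 for a valid PMF

# Marginals: sum the joint PMF over the other variable.
p_x = {x: sum(joint[(x, y)] for y in values) for x in values}
p_y = {y: sum(joint[(x, y)] for x in values) for y in values}

# Independence requires joint == product of marginals for EVERY pair.
independent = all(
    joint[(x, y)] == p_x[x] * p_y[y] for x in values for y in values
)

print("total probability:", total)
print("P(X=1, Y=1) =", joint[(1, 1)], " P(X=1)*P(Y=1) =", p_x[1] * p_y[1])
print("P(X=2, Y=2) =", joint[(2, 2)], " P(X=2)*P(Y=2) =", p_x[2] * p_y[2])
print("independent:", independent)
```

Under this assumption the (2, 2) cell happens to equal the product of its marginals, while the (1, 1) cell does not, which is why a single matching cell is not enough to conclude independence.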

Q2. A sample of 300 car-owning employees from various technology companies was asked whether they own a luxury car. The responses are tabulated below.

| Company | Infosys Limited | Wipro Limited | TCS | CTS | Tech Mahindra | Accenture | IBM |
|---|---|---|---|---|---|---|---|
| Number of employees who own a luxury car | 10 | 4 | 7 | 8 | 3 | 9 | 9 |
| Number of employees who do not own a luxury car | 30 | 31 | 38 | 22 | 27 | 56 | 46 |

Consider the level of significance (α) = 0.05 for this problem. The p-value for this problem is 0.398. What is the value of the test statistic for this dataset, and what are the degrees of freedom (dof)?

a. χ² = 3.95, dof = 8
b. χ² = 6.23, dof = 6
c. χ² = 10.27, dof = 4
d. χ² = 6.23, dof = 4
e. χ² = 14.26, dof = 8
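
The test statistic and degrees of freedom in Q2 can be reproduced from the contingency table. The following is a minimal pure-Python sketch of the chi-square test of independence; the variable names are illustrative.

```python
# Observed counts: rows = owns luxury car / does not, columns = 7 companies.
observed = [
    [10, 4, 7, 8, 3, 9, 9],
    [30, 31, 38, 22, 27, 56, 46],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Chi-square statistic: sum over cells of (O - E)^2 / E,
# where E = row total * column total / grand total.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        chi2 += (o - e) ** 2 / e

# Degrees of freedom: (rows - 1) * (columns - 1).
dof = (len(observed) - 1) * (len(observed[0]) - 1)

print("n =", n)
print("chi-square =", round(chi2, 2))
print("dof =", dof)
```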

Q3. A marketing team claims that customer preferences for four product colours are equally distributed. To test this, they survey 200 customers and get the following responses:

| Colour | Red | Blue | Green | Yellow |
|---|---|---|---|---|
| Observed Frequency | 65 | 55 | 40 | 60 |

What is the chi-square test statistic for this data, and which test does it correspond to?

a. 9.00 — Chi-square test of independence


b. 9.00 — Chi-square goodness of fit test
c. 6.25 — Chi-square goodness of fit test
d. 10.00 — One-sample z-test
e. 8.75 — ANOVA test
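
A goodness-of-fit statistic compares observed counts with the counts expected under the claimed distribution. The sketch below assumes the expected count per colour is the stated survey size divided by four (200 / 4 = 50). Note that the listed frequencies sum to 220 rather than 200, so the result depends on that assumption.

```python
observed = {"Red": 65, "Blue": 55, "Green": 40, "Yellow": 60}

# Assumption: expected count per colour uses the stated survey size of 200.
stated_sample_size = 200
expected = stated_sample_size / len(observed)

# Goodness-of-fit statistic: sum of (O - E)^2 / E over categories.
chi2 = sum((o - expected) ** 2 / expected for o in observed.values())
dof = len(observed) - 1

print("expected per colour:", expected)
print("chi-square =", chi2)
print("dof =", dof)
```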

Q4. A logistics firm claims the median delivery time for its premium service is 24 hours. A quality analyst tests this by recording delivery times (in hours) for 10 randomly selected premium orders: 22, 26, 24, 30, 28, 23, 25, 27, 29, 31

Using the Wilcoxon signed-rank test at a 5% significance level, the analyst calculates the test statistic. The critical value for n = 10 (two-tailed) is 8. Calculate the sum of positive ranks (W⁺) and determine the correct conclusion.

a. W⁺ = 32.5; fail to reject H₀
b. W⁺ = 27.5; reject H₀
c. W⁺ = 40.5; reject H₀
d. W⁺ = 35.5; fail to reject H₀
e. None of the above.

Q5. A company tests the effectiveness of a new training program by comparing employee productivity scores before and after the training. The scores (out of 100) for 6 employees are:

| Employee | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Before | 72 | 65 | 80 | 68 | 75 | 82 |
| After | 78 | 70 | 82 | 65 | 77 | 85 |

Using the Wilcoxon signed-rank test (two-tailed, α = 0.05), the critical value for n = 6 is 2. What is the sum of positive ranks (W⁺) and the correct conclusion?

a. W⁺ = 10; Fail to reject H₀
b. W⁺ = 12; Reject H₀
c. W⁺ = 15; Reject H₀
d. W⁺ = 18; Fail to reject H₀
e. W⁺ = 20; Reject H₀

Q6. The monthly salaries of 10 programmers and the lines of code each writes per day are given in the table. How many lines of code per day would be predicted for a programmer whose monthly salary is $48,000? Slope (β₁) = 4.5 and intercept (β₀) = -30.

| Monthly salary (in thousands of dollars) | 32.7 | 34.7 | 35.6 | 37.3 | 37.5 | 38.5 | 39.5 | 40.5 | 41.2 | 42.5 |
|---|---|---|---|---|---|---|---|---|---|---|
| Lines of code written per day | 110 | 111 | 114 | 145 | 156 | 160 | 162 | 162 | 169 | 171 |

a. 150
b. 165
c. 175
d. 180
e. 186
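
The prediction in Q6 follows directly from the fitted line. A minimal sketch, using the slope and intercept given in the question and expressing the salary in thousands of dollars to match the table (the names are illustrative):

```python
slope = 4.5        # beta_1, given in the question
intercept = -30.0  # beta_0, given in the question

# The table lists salary in thousands of dollars, so $48,000 becomes 48.
salary_thousands = 48.0

predicted_lines = intercept + slope * salary_thousands
print("predicted lines of code per day:", predicted_lines)  # 186.0
```

Since 48 lies outside the observed salary range (32.7 to 42.5), this is an extrapolation of the fitted line.
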

Q7. A doctor wants to predict if a patient has a certain disease (1 = Yes, 0 = No) based on their age. A logistic regression model gives this equation:

Log (odds of disease) = −2 + 0.05 × (Age)

What is the predicted probability of disease (P (Y = 1)) for a 40-year-old patient?

a. 12%
b. 25%
c. 50%
d. 75%
e. 90%
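
Converting log-odds to a probability uses the logistic (sigmoid) function, p = 1 / (1 + e^(−log-odds)). A short sketch for the Q7 model, with illustrative names:

```python
import math

def sigmoid(log_odds: float) -> float:
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

age = 40
log_odds = -2 + 0.05 * age   # model from the question
probability = sigmoid(log_odds)

print("log-odds:", round(log_odds, 6))
print("P(Y = 1):", round(probability, 4))
```

A log-odds of zero corresponds to even odds, that is, a probability of 0.5.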

Q8. A hospital wants to predict whether a patient is at high risk of readmission (1 = Yes, 0 = No) based on length of stay (in days) and number of chronic conditions. The logistic model is:

Intercept (β₀) = -3.2, Length of stay (β₁) = 0.15 per day, Chronic conditions (β₂) = 0.8 per condition.

For a patient who stayed for 6 days and has 2 chronic conditions, calculate (1) the log-odds of readmission and (2) the probability of readmission.

a. Log-odds: -0.5; Probability: 37.8%


b. Log-odds: 0.1; Probability: 52.5%
c. Log-odds: -1.2; Probability: 23.2%
d. Log-odds: 0.8; Probability: 69.0%
e. Log-odds: -2.0; Probability: 11.9%

Q9. A medical AI model predicts whether a patient has a disease (1 = Positive, 0 = Negative). The confusion matrix for 200 test cases is:

| | Predicted Positive | Predicted Negative |
|---|---|---|
| Actual Positive | 50 | 10 |
| Actual Negative | 20 | 120 |

What is the F1-Score of this model?

a. 0.67
b. 0.77
c. 0.80
d. 0.83
e. 0.90
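
The F1-score is the harmonic mean of precision and recall. The sketch below computes it from the four cells of the Q9 confusion matrix; the variable names are illustrative.

```python
# Confusion matrix cells from the question.
tp = 50   # actual positive, predicted positive
fn = 10   # actual positive, predicted negative
fp = 20   # actual negative, predicted positive
tn = 120  # actual negative, predicted negative

precision = tp / (tp + fp)   # share of predicted positives that are correct
recall = tp / (tp + fn)      # share of actual positives that are found
f1 = 2 * precision * recall / (precision + recall)

print("precision:", round(precision, 3))
print("recall:   ", round(recall, 3))
print("F1-score: ", round(f1, 2))
```

Equivalently, F1 = 2·TP / (2·TP + FP + FN) = 100 / 130.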

Q10. A company analyses its quarterly sales data (in $000s) for the past 3 years:

| Quarter | Year 1 | Year 2 | Year 3 |
|---|---|---|---|
| Q1 | 80 | 90 | 85 |
| Q2 | 110 | 120 | 115 |
| Q3 | 100 | 105 | 95 |
| Q4 | 130 | 140 | 135 |

What is the seasonality index for Q4?

a. 1.4
b. 1.28
c. 1.24
d. 1.78
e. 2.22

Q11. A store uses exponential smoothing with α = 0.4 to forecast daily sales.

• Yesterday's forecast: 100 units

• Yesterday's actual sales: 120 units

What is today's forecast?

a. 100
b. 104
c. 108
d. 112
e. 120
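
Simple exponential smoothing updates the previous forecast by a fraction α of the previous forecast error: new forecast = previous forecast + α × (actual − previous forecast). A minimal sketch with the Q11 figures (names are illustrative):

```python
alpha = 0.4
previous_forecast = 100.0  # yesterday's forecast
actual = 120.0             # yesterday's actual sales

# Move the forecast toward the actual value by a fraction alpha of the error.
forecast = previous_forecast + alpha * (actual - previous_forecast)

print("today's forecast:", round(forecast, 2))
```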

Q12. A coffee shop recorded its daily customer counts for the past 5 days:

| Day | Monday | Tuesday | Wednesday | Thursday | Friday |
|---|---|---|---|---|---|
| Customers | 120 | 150 | 130 | 140 | 160 |

Using the simple average method, what is the forecast for Saturday?

a. 120
b. 130
c. 140
d. 150
e. 160
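
With the simple average method, the forecast for the next period is the mean of all past observations. A short sketch with the Q12 data:

```python
customers = {
    "Monday": 120,
    "Tuesday": 150,
    "Wednesday": 130,
    "Thursday": 140,
    "Friday": 160,
}

# The forecast is the arithmetic mean of all recorded days.
saturday_forecast = sum(customers.values()) / len(customers)
print("Saturday forecast:", saturday_forecast)  # 140.0
```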

Q13. A retailer uses a weighted average method to forecast weekly sales, assigning these weights to
the past 3 weeks:

• Most recent week (t-1): Weight = 0.5

• 2 weeks ago (t-2): Weight = 0.3

• 3 weeks ago (t-3): Weight = 0.2

Sales data:
• Week t-3: 200 units

• Week t-2: 220 units

• Week t-1: 250 units

What is the weighted average forecast for the next week?

a. 210
b. 231
c. 233
d. 235
e. 240
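
The weighted average forecast multiplies each past week's sales by its weight and sums the results. A minimal sketch using the Q13 weights and sales (the dictionary keys are illustrative labels):

```python
weights = {"t-1": 0.5, "t-2": 0.3, "t-3": 0.2}
sales = {"t-1": 250, "t-2": 220, "t-3": 200}

# The weights should sum to 1 so the forecast stays on the scale of the data.
weight_sum = sum(weights.values())

forecast = sum(weights[week] * sales[week] for week in weights)
print("weighted average forecast:", round(forecast, 2))
```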

Q14. A farmer must choose between three crop options with profits ($000s) dependent on rainfall:

| Crop | Drought (θ₁) | Normal (θ₂) | Heavy Rain (θ₃) |
|---|---|---|---|
| Corn (A) | 20 | 60 | 40 |
| Soybean (B) | 30 | 50 | 30 |
| Wheat (C) | 40 | 40 | 20 |

If the farmer uses the Maximax criterion, which crop will they choose?

a. Corn (A)
b. Soybean (B)
c. Wheat (C)
d. Indifferent between A and C
e. Cannot determine/ not enough data
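
The Maximax criterion takes the best payoff of each alternative and then picks the alternative whose best payoff is largest. A sketch with the crop table (payoffs listed in the order drought, normal, heavy rain):

```python
# Profits in $000s for (drought, normal, heavy rain).
payoffs = {
    "Corn (A)": [20, 60, 40],
    "Soybean (B)": [30, 50, 30],
    "Wheat (C)": [40, 40, 20],
}

# Step 1: best-case payoff for each crop.
best_case = {crop: max(values) for crop, values in payoffs.items()}

# Step 2: choose the crop with the largest best-case payoff.
best = max(best_case, key=best_case.get)

print("best-case payoffs:", best_case)
print("maximax choice:", best)
```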

Q15. Using the same crop profit table above, assume the farmer’s optimism coefficient (α) is 0.7.

Which crop maximizes the Hurwicz weighted payoff?

a. Corn (A)
b. Soybean (B)
c. Wheat (C)
d. Indifferent between A and B
e. Cannot determine/ not enough data
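
The Hurwicz weighted payoff for each alternative is α × (best payoff) + (1 − α) × (worst payoff). A sketch with α = 0.7 and the same crop table, repeated here so the block is self-contained:

```python
alpha = 0.7  # optimism coefficient from the question

# Profits in $000s for (drought, normal, heavy rain).
payoffs = {
    "Corn (A)": [20, 60, 40],
    "Soybean (B)": [30, 50, 30],
    "Wheat (C)": [40, 40, 20],
}

# Weight the best outcome by alpha and the worst outcome by (1 - alpha).
hurwicz = {
    crop: alpha * max(values) + (1 - alpha) * min(values)
    for crop, values in payoffs.items()
}
best = max(hurwicz, key=hurwicz.get)

for crop, value in hurwicz.items():
    print(crop, round(value, 2))
print("Hurwicz choice:", best)
```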

Q16. A tech startup must decide whether to develop a new product (cost: $200K) or upgrade an existing product (cost: $80K). Market research suggests:

1. If they Develop New Product:

o 60% chance of High Demand (Profit: $500K)

o 40% chance of Low Demand (Profit: $100K)

2. If they Upgrade Existing Product:


o 70% chance of Moderate Success (Profit: $300K)

o 30% chance of Failure (Profit: $0)

Using a decision tree, which option has the higher expected monetary value (EMV)?

a. Develop New Product
b. Upgrade Existing Product
c. Both have equal EMV
d. It depends on the accuracy of market research
e. Cannot be determined with this dataset.
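
The EMV of each branch is the probability-weighted sum of its payoffs. The question does not say whether the quoted profits are already net of the development cost, so the sketch below computes both readings; that treatment is an assumption, and the names are illustrative.

```python
# Amounts in $K. Each outcome is (probability, profit).
options = {
    "Develop New Product": {"cost": 200, "outcomes": [(0.6, 500), (0.4, 100)]},
    "Upgrade Existing Product": {"cost": 80, "outcomes": [(0.7, 300), (0.3, 0)]},
}

# Reading 1: quoted profits are final payoffs.
emv_gross = {
    name: sum(p * profit for p, profit in opt["outcomes"])
    for name, opt in options.items()
}

# Reading 2 (assumption): cost still has to be subtracted from the payoffs.
emv_net = {name: emv_gross[name] - opt["cost"] for name, opt in options.items()}

best_gross = max(emv_gross, key=emv_gross.get)
best_net = max(emv_net, key=emv_net.get)

for name in options:
    print(name, "gross:", round(emv_gross[name], 2), "net:", round(emv_net[name], 2))
print("higher EMV (gross):", best_gross)
print("higher EMV (net):  ", best_net)
```

Under either reading the same option has the higher EMV, although the margin is much smaller once costs are subtracted.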

Q17. In logistic regression, the logit function is defined as log(p / (1 − p)). What is the primary purpose of this transformation?

a. To ensure the dependent variable is normally distributed.


b. To model a linear relationship between predictors and probability.
c. To constrain predicted probabilities between 0 and 1.
d. To minimize the sum of squared residuals.
e. To handle multicollinearity among predictors.

Q18. In a simple linear regression model Y = β₀ + β₁X, what does the slope coefficient (β₁) represent?

a. The predicted value of Y when X = 0.


b. The change in Y for a one-unit increase in X, holding all else constant.
c. The strength of the correlation between X and Y.
d. The probability that X influences Y.
e. The error term of the model.

Q19. Which of the following is not an assumption of logistic regression?

a. The dependent variable is binary.


b. Predictors are linearly related to the log-odds.
c. Residuals must be normally distributed.
d. No severe multicollinearity among predictors.
e. Observations are independent.

Q20. The output of a logistic regression model for a binary classification problem is:

P(Y = 1) = 1 / (1 + e^(−(β₀ + 0.5X)))

How would you interpret the coefficient of X (0.5)?

a. A one-unit increase in X increases the probability of Y=1 by 50%.


b. A one-unit increase in X multiplies the odds of Y = 1 by e^0.5.
c. X has no significant effect on Y.
d. The log-odds of Y=1 decrease by 0.5 for every unit increase in X.
e. X is negatively correlated with Y.
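
In a logistic model, raising X by one unit multiplies the odds of Y = 1 by e raised to the coefficient of X, whatever the intercept is. The sketch below checks this numerically for the coefficient 0.5; the intercept value used is purely hypothetical and chosen only for illustration.

```python
import math

slope = 0.5                   # coefficient of X from the question
hypothetical_intercept = 1.0  # assumption: illustrative value only

def probability(x: float) -> float:
    """P(Y = 1) under the logistic model with the values above."""
    return 1.0 / (1.0 + math.exp(-(hypothetical_intercept + slope * x)))

def odds(p: float) -> float:
    """Odds corresponding to probability p."""
    return p / (1.0 - p)

# Odds ratio for a one-unit increase in X, at several starting points.
ratios = [odds(probability(x + 1)) / odds(probability(x)) for x in (0.0, 1.0, 2.0)]

print("odds ratios:", [round(r, 4) for r in ratios])
print("e^0.5      :", round(math.exp(slope), 4))
```

The ratio is the same at every starting value of X, whereas the change in probability itself is not constant.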

Q21. In multiple linear regression with two predictors (X1 and X2), how is the adjusted R-squared different from R-squared?

a. Adjusted R-squared decreases as more predictors are added.


b. Adjusted R-squared accounts for the number of predictors and sample size.
c. Adjusted R-squared is always higher than R-squared.
d. Adjusted R-squared measures the significance of individual coefficients.
e. Adjusted R-squared ignores the effect of irrelevant predictors.

Q22. What is the key distinction between parametric and non-parametric tests?

a. Parametric tests require distributional assumptions, while non-parametric tests do not.
b. Non-parametric tests always have higher statistical power.
c. Parametric tests can only be used for categorical data.
d. Non-parametric tests assume a specific population distribution.
e. Parametric tests are only applicable to small sample sizes.

Q23. Which of the following is a non-parametric alternative to the paired t-test?

a. ANOVA
b. Wilcoxon signed-rank test
c. Chi-square test
d. Pearson’s correlation
e. Two-sample z-test

Q24. Which of the following tests is parametric?

a. Mann-Whitney U test
b. Kruskal-Wallis test
c. Spearman’s rank correlation
d. Independent samples t-test
e. Friedman test

Q25. In the Wilcoxon signed-rank test, how are tied differences handled?

a. They are excluded from the analysis.


b. They are assigned the same average rank.
c. They are treated as zero differences.
d. They are given ranks randomly.
e. The test cannot be applied if ties exist.

Q26. What is the null hypothesis of the Wilcoxon signed-rank test?

a. The two groups have equal means.


b. The distributions of both groups are identical.
c. The data follows a normal distribution.
d. The variances of the two groups are equal.
e. The median difference between pairs is zero.

Q27. Which forecasting method would you choose for a dataset with no trend or seasonality but occasional outliers?

a. Seasonal Naïve method


b. Naïve method
c. Simple Average method
d. Weighted Average method
e. Exponential Smoothing method

Q28. The Hurwicz Method combines:

a. Minimax and Maximax criteria.


b. Optimism (α) and pessimism (1−α) weights.
c. Only probabilities of outcomes.
d. Historical averages.
e. None of the above.

Q29. A Decision Tree splits data based on:

a. Random selection of features.
b. Criteria that maximize information gain or purity.
c. Linear regression coefficients.
d. Control chart thresholds.
e. None of the above.

Q30. The Maximin Criterion is used by a decision-maker who:

a. Seeks the highest possible payoff regardless of risk.


b. Focuses on minimizing the maximum possible loss.
c. Balances optimism and pessimism using a coefficient.
d. Ignores probabilities of outcomes.
e. Assigns weights and then takes the maximum payoff value.

Correct answers for Mock Exam

1. B
2. B
3. B
4. A
5. A
6. E
7. C
8. A
9. B
10. B
11. C
12. C
13. B
14. A
15. A
16. A
17. C
18. B
19. C
20. B
21. B
22. A
23. B
24. D
25. B
26. E
27. E
28. B
29. B
30. B
