DS II Mid Term 2017 Solution

1) The regression model between per capita income and corruption index explains 74.3% of the variation in per capita income. There is a statistically significant relationship between the two variables. 2) At a 95% confidence level, the minimum average value of per capita income when corruption index is 50 is 25552.6836 dollars. 3) The average per capita income of communist states based on the regression model is 22912.287 dollars. The model appears significant at 5%, but heteroscedasticity in the residuals means the test may not be reliable.

Uploaded by

kapadia krunal


Decision Sciences II
Mid-Term Examination (Solution)
Wednesday, October 25, 2017
Time: 180 minutes
Total No. of Pages: 20
Total No. of Questions: 3
Total marks: 55
Name ________________________
Roll No. ________________________
Section ________________________

Instructions
1. This is a closed book exam. You are NOT allowed to use text book and class notes.
2. Answer all questions only in the space provided following the question.
3. Show all work and give adequate explanations to get full credit.
4. You may use the backside of the last page for rough work only if needed. Do NOT attach any
rough work/sheets.
5. Encircle or underline your final answer for each part.
6. No clarifications will be made during the exam.
7. Assume 95% confidence level if necessary (α = 0.05).
8. Use approximate critical values for Z, t, F, and χ² tests if the exact value is not available in the tables attached with the question paper.

Question Number   Q1   Q2   Q3   Total
Max Marks         20   15   20   55
Marks Scored

Question 1 (20 points)

The per capita income of 20 countries was analysed using the variables described in Table 1.

Table 1. Data Dictionary

S.No   Variable                                                                               Variable Type   Code in SPSS output
1      Per Capita Income (in Dollars)                                                         Numerical       Per Capita
2      Corruption Index (higher value indicates lower level of corruption in the country)    Integer         CI
3      Gini Index (measure of wealth distribution and discrimination)                        Numerical       Gini
4      Communist State (whether the country was/is a communist state)                        Binary          CS (1 = Communist State; 0 otherwise)

Descriptive statistics of the variables and correlations are shown in Tables 2 and 3 respectively.

Table 2 Descriptive Statistics

                     N    Minimum   Maximum   Mean        Std. Deviation
CI                   20   29.0      90.0      61.700      20.6171
Gini                 20   23.5      53.7      34.740      7.3846
CS                   20   .0        1.0       .250        .4443
PerCapita            20   12275.0   69249.0   37789.050   15847.4829
Valid N (listwise)   20

Table 3 Correlations
CI Gini CS PerCapita
CI 1 -.464* -.612** .862**
Gini -.464* 1 .253 -.338
CS -.612** .253 1 -.556*
Per Capita .862** -.338 -.556* 1

Model 1
Y (Per Capita) = β0 + β1 × CI

A simple linear regression model (Model 1) is developed between per capita income (Y) and corruption index (CI). SPSS model outputs are shown in Tables 4 and 5. The Normal P-P Plot and Residual Plot are shown in Figures 1 and 2 respectively.

Table 4 Model Summary (b)

Model   R   R Square   Adjusted R Square   Std. Error of the Estimate
1                                          8241.4390
a. Predictors: (Constant), CI
b. Dependent Variable: Per Capita

Table 5 Coefficients (a)

Model           B (Unstandardized)   Std. Error   Beta (Standardized)   t       Sig.
1  (Constant)   -3112.753            5950.818                           -.523   .607
   CI           662.914              91.706       .862
a. Dependent Variable: Per Capita

Figure 1 Normal Probability Plot Figure 2 Residual Plot

Question 1.1 (1 point)
What proportion of the variation in per capita income is explained by corruption index (CI)?

SOLUTION:
r = 0.862 (correlation between PerCapita and CI)
R² = r² = (0.862)² = 0.743
74.3% of the variation in per capita income is explained by CI.

Question 1.2 (1 point)

Is there a statistically significant relationship between corruption index and per capita income of
the countries at 5% significance?

SOLUTION:
We can use either the F test or the t test for this problem.
H0 : β1 = 0
H1 : β1 ≠ 0
t test:
t-stat = (B1 - β1) / SE(B1) = (662.914 - 0) / 91.706 = 7.229
t-critical (0.025, 18) = 2.101
t-stat > t-critical, so we reject the null hypothesis.
F test:
F = (SSR / k) / (SSE / (n - k - 1)) = 52.05
F-critical (0.05, 1, 18) = 4.414
F > F-critical, so we reject the null hypothesis.
There is a statistically significant relationship between per capita income and CI at 5% significance.
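
Both test statistics can be reproduced from the reported numbers. The following is a minimal sketch in Python, assuming only the slope and standard error from Table 5 and the correlation from Table 3; the function names are illustrative, and the critical values are copied from the solution rather than computed.

```python
# Values taken from Table 5 and Table 3 of the text.
b1 = 662.914      # estimated slope for CI
se_b1 = 91.706    # standard error of the slope
r = 0.862         # correlation between CI and per capita income
n, k = 20, 1      # observations and number of predictors


def slope_t_stat(estimate, std_error, hypothesized=0.0):
    """t-statistic for H0: beta = hypothesized."""
    return (estimate - hypothesized) / std_error


def overall_f_from_r2(r_squared, n_obs, n_predictors):
    """Overall F-statistic expressed through R-squared."""
    return (r_squared / n_predictors) / ((1 - r_squared) / (n_obs - n_predictors - 1))


t_stat = slope_t_stat(b1, se_b1)
f_stat = overall_f_from_r2(r ** 2, n, k)

T_CRIT = 2.101  # t(0.025, 18), copied from the solution
F_CRIT = 4.414  # F(0.05, 1, 18), copied from the solution

print(round(t_stat, 3), t_stat > T_CRIT)
print(round(f_stat, 2), f_stat > F_CRIT)
```

In a simple regression F equals t² exactly; the small gap between the two values here comes from rounding r to three decimals.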

Question 1.3 (2 points)

Is it possible to conclude that the per capita income increases by at least 500 dollars for every one
unit increase in corruption index at 10% significance level? Clearly write all the steps.

SOLUTION:
H0 : β1 ≤ 500
H1 : β1 > 500
t test:
t-stat = (B1 - β1) / SE(B1) = (662.914 - 500) / 91.706 = 1.776
t-critical (0.10, 18) = 1.33 (one-tailed test)
t-stat > t-critical, so we reject the null hypothesis.
Therefore, we can conclude that per capita income increases by at least 500 dollars for every one-unit increase in corruption index at the 10% significance level.

Question 1.4 (1 Point)

What can you conclude about model between per capita and CI based on the plots in Figures 1
and 2?

SOLUTION:
Figure 1: The errors approximately follow a normal distribution.
Figure 2: The residual plot shows no pattern, so we can conclude that the residuals are homoscedastic.

Question 1.5 (3 Points)

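What is the minimum average value of per capita income at a 95% confidence level when CI = 50?

SOLUTION:

The confidence interval for the mean response is

$$\hat{Y}_i \pm t_{\alpha/2,\,n-k-1} \; s_e \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{SS_X}}$$

Ŷi = -3112.753 + 662.914 × 50 = 30032.947

tα/2, 18 = 2.101

se = 8241.439

Xi = 50, X̄ = 61.70

SSX = (n - 1) × SD² = 19 × (20.6171)² = 8076.23

Minimum average value = 30032.947 - 2.101 × 8241.439 × √(1/20 + (50 - 61.70)² / 8076.23) = 25552.6836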
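
The lower bound can be checked numerically. This is a small sketch, assuming the coefficients from Table 5, the descriptive statistics from Table 2 and the critical value 2.101 quoted above; `mean_response_interval` is an illustrative helper name, not part of any library.

```python
import math


def mean_response_interval(y_hat, t_crit, s_e, n, x0, x_bar, ssx):
    """Confidence interval for the mean response at x0 (simple linear regression)."""
    half_width = t_crit * s_e * math.sqrt(1 / n + (x0 - x_bar) ** 2 / ssx)
    return y_hat - half_width, y_hat + half_width


b0, b1 = -3112.753, 662.914      # Table 5
n = 20
x_bar, sd_x = 61.700, 20.6171    # Table 2 (CI)
s_e = 8241.439                   # Table 4, standard error of the estimate
t_crit = 2.101                   # t(0.025, 18), copied from the solution

ssx = (n - 1) * sd_x ** 2        # sum of squared deviations of CI
y_hat = b0 + b1 * 50             # point prediction at CI = 50
lower, upper = mean_response_interval(y_hat, t_crit, s_e, n, 50, x_bar, ssx)

print(round(ssx, 2), round(y_hat, 3), round(lower, 2), round(upper, 2))
```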

A second model is developed between Per Capita and Communist State (CS).
Model 2
Y (Per Capita) = β0 + β1 × CS
The output is shown in Table 6. Normal probability and residual plots are shown in Figures 3 and 4 respectively.
Table 6 Coefficients (a)

Model           B (Unstandardized)   Std. Error   Beta (Standardized)   t        Sig.
1  (Constant)   42743.933            3495.319                           12.229   .000
   CS                                6990.639     -.556
a. Dependent Variable: PerCapita

Figure 3 Normal Probability Plot

Figure 4 Residual Plot
Question 1.6 (2 Points)
Calculate the average per capita income of communist states. Clearly write all the steps.

SOLUTION:

$$\hat{Y} = \beta_0 + \beta_1 X$$

β1 = Std β × (SY / SX) = -0.556 × (15847.4829 / 0.4443) = -19831.646
β0 = 42743.933

X = 1 (for communist states)

Ŷ = 42743.933 - 19831.646 = 22912.287
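
The conversion from the standardized to the unstandardized slope can be sketched as follows, assuming the Beta from Table 6 and the standard deviations from Table 2; the helper name is illustrative.

```python
def unstandardized_slope(std_beta, sd_y, sd_x):
    """Recover B from the standardized coefficient: B = Beta * (S_Y / S_X)."""
    return std_beta * sd_y / sd_x


b0 = 42743.933                                           # Table 6, constant
b1 = unstandardized_slope(-0.556, 15847.4829, 0.4443)    # Table 6 Beta, Table 2 SDs

# CS is a 0/1 dummy, so the fitted value at CS = 1 is the estimated
# average per capita income of communist states.
y_hat_communist = b0 + b1 * 1

print(round(b1, 3), round(y_hat_communist, 2))
```

Because CS is a dummy variable, the constant is the estimated average for non-communist states and the slope is the difference between the two group averages.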

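Question 1.7 (2 points)

Is Model 2 statistically significant at 5% significance? Use all the information provided (Table 6, Figures 3 and 4). Clearly write all the arguments.

SOLUTION:
Table 6 does not report the t-statistic or p-value for CS (the .000 shown is for the constant). Using β1 = -19831.646 from Question 1.6, t = -19831.646 / 6990.639 = -2.837, and |t| > t-critical (0.025, 18) = 2.101, so the p-value is below 0.05 and the model appears significant.
However, Figure 4 shows heteroscedasticity in the residuals, so the standard error, and hence the p-value, may not be reliable.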

A stepwise regression model is developed using CI and Gini as independent variables and the outputs are shown in Table 7.
Table 7 Stepwise Regression Output

Model           B (Unstandardized)   Std. Error   t       Sig.   Zero-order   Partial   Part
1  (Constant)   -3112.753            5950.818     -.523   .607
   CI           662.914              91.706       7.229   .000   .862         .862      .862
2  (Constant)   -10781.284           14572.250    -.740   .469
   CI           691.235              105.487      6.553   .000   .862         .846      .797
   Gini                                                          -.338        .139      .070

Question 1.8 (2 Points)

What is the value of R-square after adding the variable Gini to the model?

SOLUTION:
R² after adding Gini = R² of the model without Gini + (part correlation of Gini in the new model)²
= 0.743 + (0.070)² = 0.7479

Question 1.9 (2 points)

Carry out an appropriate hypothesis test to check whether the variable “Gini” is worth adding to
the model at 5% significance.

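SOLUTION:
Use the partial F test for this problem.
H0 : β2 = 0 (β2 is the coefficient of Gini)
H1 : β2 ≠ 0
Partial F test:

$$F = \frac{(R_F^2 - R_R^2)/r}{(1 - R_F^2)/(n - k - 1)} = \frac{(0.7479 - 0.743)/1}{(1 - 0.7479)/(20 - 2 - 1)} = 0.3304$$

where R_F² and R_R² are the R-square values of the full and reduced models, r is the number of added variables and k is the number of predictors in the full model.

F-critical (0.05, 1, 17) = 4.451

F < F-critical, so we do not reject the null hypothesis.

Thus, adding Gini to the model is not worthwhile.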
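
The R-square update and the partial F statistic can be sketched together. This assumes R² = 0.743 for the model with CI only and the part correlation .070 from Table 7; the function names are illustrative, and the critical value is copied from the solution.

```python
def r2_after_adding(r2_reduced, part_corr_new):
    """R-square of the larger model = reduced R-square + (part correlation)^2."""
    return r2_reduced + part_corr_new ** 2


def partial_f(r2_full, r2_reduced, n_added, n_obs, k_full):
    """Partial F statistic for the variables added to the reduced model."""
    numerator = (r2_full - r2_reduced) / n_added
    denominator = (1 - r2_full) / (n_obs - k_full - 1)
    return numerator / denominator


r2_reduced = 0.743                              # model with CI only
r2_full = r2_after_adding(r2_reduced, 0.070)    # part correlation of Gini, Table 7
f_gini = partial_f(r2_full, r2_reduced, n_added=1, n_obs=20, k_full=2)

F_CRIT = 4.451                                  # F(0.05, 1, 17), copied from the solution
print(round(r2_full, 4), round(f_gini, 4), f_gini > F_CRIT)
```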

Question 1.10 (2 points)

Calculate the variance inflation factor between variables CI and Gini. What can you conclude
from the calculated VIF value?

SOLUTION:
VIF = 1 / (1 - R²) = 1 / (1 - (-0.464)²) = 1.27436
(R = correlation between CI and Gini)
Since VIF < 4, there is no serious multicollinearity between CI and Gini.
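
A short sketch of the VIF calculation, assuming the CI-Gini correlation of -.464 from Table 3. With only two predictors, R² in the VIF formula is simply the squared pairwise correlation, and the reciprocal of the VIF is the tolerance.

```python
def vif_from_r2(r_squared):
    """Variance inflation factor for a predictor regressed on the other predictors."""
    return 1 / (1 - r_squared)


r_ci_gini = -0.464               # Table 3
vif = vif_from_r2(r_ci_gini ** 2)
tolerance = 1 / vif              # tolerance = 1 - R^2

print(round(vif, 5), round(tolerance, 3))
```

The tolerance obtained this way can be compared with the value reported for Gini in Table 9.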

A stepwise regression model is developed using all the 3 independent variables and the SPSS
outputs are given in Tables 8 and 9
Table 8 Coefficients (a)

Model           B (Unstandardized)   Std. Error   Beta (Standardized)   t       Sig.
1  (Constant)   -3112.753            5950.818                           -.523   .607
   CI           662.914              91.706       .862                  7.229   .000
a. Dependent Variable: PerCapita

Table 9 Excluded Variables (a)

Model     Beta In     t       Sig.   Partial Correlation   Tolerance (Collinearity Statistics)
1  Gini   .079 (b)    .579    .570   .139                  .785
   CS     -.044 (b)   -.287   .777   -.070                 .625
a. Dependent Variable: PerCapita
b. Predictors in the Model: (Constant), CI

Question 1.11 (2 points)

Based on the information provided in Tables 8 and 9, is it possible to conclude that there is no
statistically significant relationship between Per Capita and independent variables Gini and CS?
Excluded variables in Table 9 are variables that are not part of the regression model (statistically
not significant) when stepwise regression is used.

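SOLUTION:
We test PerCapita against Gini and PerCapita against CS separately, each with a simple-regression F test:
F = R² / ((1 - R²) / 18), where R² is the squared correlation from Table 3.

PerCapita and Gini:
H0 : β1 = 0
H1 : β1 ≠ 0
F (Gini) = 2.3215 (r = -.338)

PerCapita and CS:
H0 : β1 = 0
H1 : β1 ≠ 0
F (CS) = 8.0543 (r = -.556)

F-critical (0.05, 1, 18) = 4.414

F (Gini) < F-critical, so we do not reject the null hypothesis. Thus, Gini is not significant.

F (CS) > F-critical, so we reject the null hypothesis. Thus, CS is significant on its own, even though stepwise regression excludes it once CI is in the model. So it is not possible to conclude that neither variable has a significant relationship with PerCapita.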

Question 2 (15 points)

Applicants who apply for a job at Precision Watches Inc., which requires extensive manual
assembly of small intricate parts, are initially given three different tests to measure their manual
dexterity. The ones who are hired are then periodically given a performance rating on a 0-100
scale that combines their speed and accuracy in performing the required assembly operations.
Data is collected on the test scores and performance ratings for a randomly selected group of 80
employees who continued working for the company. Their seniority (months with the company)
at the time of the performance rating is also noted. The summary information and the results from
four regression models developed using the data are given below:
Pairwise Correlation Matrix

            JobPerf   Seniority   Test1   Test2   Test3
JobPerf     1
Seniority   0.43      1.00
Test1       0.58                  1.00
Test2       0.52                  0.60    1.00
Test3       0.62                  0.66    0.80    1.00

Descriptive Statistics

                     N    Minimum   Maximum   Mean    Std. Deviation
JobPerf              80   38        100       65.75   10.630
Seniority            80   7         30        18.89   5.00
Test1                80   31        82        60.53   9.576
Test2                80   37        86        60.75   9.872
Test3                80   26        77        50.71   9.181
Valid N (listwise)   80
Model 1 Summary

Model   R   R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1                      .176                9.651                        1.856
a Predictors: (Constant), Seniority
b Dependent Variable: JobPerf

Model 1 ANOVA

Model           Sum of Squares   Df   Mean Square   F        Sig.
1  Regression   1662.584         1    1662.584      17.852   .000
   Residual     7264.416         78   93.134
   Total        8927.000         79

Model 1 Coefficients

Model           B (Unstandardized)   Std. Error   Beta (Standardized)   t        Sig.
1  (Constant)   48.928               4.125                              11.861   .000
   Seniority    .891                              .432

Model 2 Summary

Model   R      R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1       .764   .583       .561                7.042                        1.878
a Predictors: (Constant), Test3, Seniority, Test1, Test2
b Dependent Variable: JobPerf

Model 2 ANOVA

Model           Sum of Squares   Df   Mean Square   F        Sig.
1  Regression   5208.110         4    1302.027      26.258   .000
   Residual     3718.890         75   49.585
   Total        8927.000         79

Model 2 Coefficients

Model           B (Unstandardized)   Std. Error   Beta (Standardized)   t       Sig.   Tolerance   VIF
1  (Constant)   6.557                6.187                              1.060   .293
   Seniority    .801                 .155         .388                  5.171   .000   .986        1.014
   Test1        .300                 .112         .271                  2.693   .009   .550        1.819
   Test2        .086                 .135         .080                  .640    .524   .355        2.816
   Test3        .407                 .154         .352                  2.638   .010   .313        3.197

Model 3 Summary

Model   R      R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1       .762   .581       .565                7.014                        1.891
a Predictors: (Constant), Test3, Seniority, Test1
b Dependent Variable: JobPerf

Model 3 ANOVA

Model           Sum of Squares   Df   Mean Square   F        Sig.
1  Regression   5187.803         3    1729.268      35.148   .000
   Residual     3739.197         76   49.200
   Total        8927.000         79

Model 3 Coefficients

Model           B (Unstandardized)   Std. Error   Beta (Standardized)   t       Sig.   Tolerance   VIF
1  (Constant)   7.893                5.801                              1.361   .178
   Seniority    .793                 .154                               5.157   .000   .993        1.008
   Test1        .312                 .110                               2.844   .006   .565        1.771
   Test3        .473                 .114                               4.145   .000   .567        1.764

Model 4 Summary

Model   R      R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1       .757   .574       .562                7.031                        1.843
a Predictors: (Constant), AvgScore, Seniority
b Dependent Variable: JobPerf

Model 4 ANOVA

Model           Sum of Squares   Df   Mean Square   F        Sig.
1  Regression   5120.011         2    2560.006      51.779   .000
   Residual     3806.989         77   49.441
   Total        8927.000         79

Model 4 Coefficients

Model           B (Unstandardized)   Std. Error   Beta (Standardized)   t       Sig.   Tolerance   VIF
1  (Constant)   5.407                6.010                              .900    .371
   Seniority    .821                 .154         .398                  5.339   .000   .997        1.003
   AvgScore     .782                 .094         .623                  8.362   .000   .997        1.003

Use the information given above to answer the following questions. Specify the model(s) you
use to draw your conclusions, where relevant.

a) Can it be concluded that performance rating improves with length of stay with the company
(Seniority), irrespective of the original test scores? Select the appropriate model to answer the
question. (3 points)

Ans. Here we use Model 1 to test the hypothesis since test scores are ignored.
H0 : β1 ≤ 0 vs. H1 : β1 > 0 (one-sided t-test)
S.E(β1) = Se / (SSx)^(.5), where SSx = (n-1) × Sx² = 79 × 25
= 9.651 / (79*25)^(.5) = 0.217
So the t-statistic is given by
tcalc = (β1 - 0) / S.E(β1) = .891 / .217 = 4.103
From the table, t.05,78 = 1.665
So, tcalc > t.05,78.
So, we reject H0 and conclude that performance rating increases with length of stay with the company.

b) Predict the average performance rating for a worker who has 15 months of Seniority. What
are the highest and lowest performance ratings that this worker is likely to get at 90%
confidence level? (3 Points)

Ans. To predict the average performance rating of a worker with 15 months of seniority:
AvgY = b0 + b1 * Seniority = 48.928 + .891*15 = 62.293
Hence, the average performance rating of the worker is 62.293.

Prediction interval at 90% confidence level (note: PI for an individual, not the average)
P.I = (Y - t.05,78 * Se * (1 + (1/n) + (X - X̄)²/SSx)^(.5), Y + t.05,78 * Se * (1 + (1/n) + (X - X̄)²/SSx)^(.5))
= (62.293 - 1.664 * 9.651 * (1 + (1/80) + (15 - 18.89)²/(79*25))^(.5),
62.293 + 1.664 * 9.651 * (1 + (1/80) + (15 - 18.89)²/(79*25))^(.5))
= (62.293 - 16.220, 62.293 + 16.220)
= (46.073, 78.513)
So the highest and lowest performance ratings are 78.513 and 46.073.
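
The prediction interval differs from the confidence interval for the mean only by the extra 1 under the square root. A minimal sketch, assuming the Model 1 coefficients, the Seniority statistics from the descriptive table and the critical value 1.664 used above; `prediction_interval` is an illustrative name.

```python
import math


def prediction_interval(y_hat, t_crit, s_e, n, x0, x_bar, ssx):
    """Interval for an individual response at x0 (note the leading 1 under the root)."""
    half_width = t_crit * s_e * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / ssx)
    return y_hat - half_width, y_hat + half_width


n = 80
ssx = (n - 1) * 5.00 ** 2          # (n - 1) * Sx^2 for Seniority
y_hat = 48.928 + 0.891 * 15        # Model 1 prediction at 15 months
low, high = prediction_interval(y_hat, 1.664, 9.651, n, 15, 18.89, ssx)

print(round(y_hat, 3), round(low, 3), round(high, 3))
```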

c) If Test 2 was used to predict performance scores on its own, is it likely to be a significant predictor of JobPerf? Justify. In the presence of the other 3 variables, is it a significant predictor? Why or why not? (3 Points)

Ans. We can see from the correlation matrix that Corr(Test 2, JobPerf) = 0.52.
So if we build a model using only Test 2, it will have R² = (.52)² = .2704, i.e. it will explain 27% of the variation in JobPerf. The corresponding F-statistic is R² / ((1 - R²)/(n - 2)) = .2704 / (.7296/78) = 28.9, far above the 5% critical value of about 3.96, so Test 2 on its own is likely to be a significant predictor of JobPerf.
From Model 2, we see that in the presence of the other 3 variables, the coefficient of Test2 has a p-value = .524, indicating that Test 2 is not a significant predictor once the other variables are included. The reason is that Test 2 is highly correlated with Test 3 (.8) and Test 1 (.6). The part of the variation explained by Test 2 has largely been explained by these other variables, leading to Test 2 becoming insignificant.

d) Can it be concluded that employees with higher average scores on the tests stay longer with
the company? Choose the appropriate models to compare. (3 points)

Ans. Here we consider Model 1 and Model 4. We use omitted variable bias to
conclude.
Model 1: 48.928 + 0.891*Seniority
Model 4: 5.407 + 0.821*Seniority + 0.782*AvgScore
The formula for omitted variable bias is given by -
α1 = β1 + β2 * Cov(X1,X2) / Var(X1) where, X1=Seniority , X2=AvgScore
Now , we are given α1 =0.891 , β1 = 0.821 , β2 = 0.782
Therefore,
Cov(X1,X2) / Var(X1) = (.891-.821)/.782 = .07/.782 > 0
Also, Var(X1) > 0 always.
=> Cov(X1,X2) > 0 i.e. X1 and X2 are positively correlated.
So, we can conclude that employees with higher average scores stay longer with the
company.
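
The omitted-variable argument can be checked with a few lines. This is a sketch under the assumptions stated in the answer (Model 1 as the short regression, Model 4 as the long one); the function name is illustrative. The second quantity uses the fact that, with exactly two predictors, tolerance equals 1 - r², so the tolerance of .997 reported for Model 4 gives the magnitude, though not the sign, of the Seniority-AvgScore correlation.

```python
import math


def implied_cov_var_ratio(slope_short, slope_long, coef_omitted):
    """Solve alpha1 = beta1 + beta2 * Cov(X1, X2) / Var(X1) for Cov/Var."""
    return (slope_short - slope_long) / coef_omitted


# alpha1 from Model 1; beta1 and beta2 from Model 4.
ratio = implied_cov_var_ratio(0.891, 0.821, 0.782)

# Magnitude of the correlation implied by the Model 4 tolerance (two predictors only).
abs_corr = math.sqrt(1 - 0.997)

print(round(ratio, 4), ratio > 0, round(abs_corr, 3))
```

The ratio is positive, which supports the direction of the conclusion, while the implied correlation is small, so the association between seniority and average score appears weak in this sample.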

e) Two employees whose seniority differs by 5 months have the same average test score. Can it
be concluded that the performance rating of the more senior employee will be at least 3 points
higher at 5% significance level? (3 Points)

Ans. Here, the seniority differs by 5 months while the average test score is the same, so we use Model 4 and test whether the performance rating changes by at least 3 points.
So, in one month the rating would have to change by (3/5) = 0.6 points.
The hypothesis:
H0 : β1 ≤ 0.6 vs. H1 : β1 > 0.6 (right-tailed t-test)
The t-statistic is computed as
tcalc = (0.821 - 0.6) / 0.154 = 0.221 / 0.154 = 1.435
From the t table we get t.05,77 = 1.664
Hence tcalc < t.05,77
Therefore, we cannot reject the null hypothesis.
Hence, we cannot conclude that, given a seniority difference of 5 months, the performance rating of the more senior employee will be at least 3 points higher.
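
A right-tailed test of a slope against a non-zero threshold follows the same pattern wherever it appears in this paper. The sketch below assumes the Model 4 estimates and the tabulated critical values quoted in the solutions; `one_sided_slope_test` is a hypothetical helper, not a library function.

```python
def one_sided_slope_test(estimate, std_error, threshold, t_crit):
    """Right-tailed test of H0: beta <= threshold against H1: beta > threshold."""
    t_stat = (estimate - threshold) / std_error
    return t_stat, t_stat > t_crit


# Question 2(e): 3 points over 5 months means 0.6 points per month (Model 4).
per_month = 3 / 5
t_e, reject_e = one_sided_slope_test(0.821, 0.154, per_month, 1.664)

# Question 1.3 uses the same test with a threshold of 500 dollars per unit of CI.
t_13, reject_13 = one_sided_slope_test(662.914, 91.706, 500, 1.33)

print(round(t_e, 3), reject_e)
print(round(t_13, 3), reject_13)
```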

Question 3 (20 Points)

A data analytics start up works with political parties during elections. They have got access to
voting patterns from various official sources. They are trying to understand how the percent of
votes obtained by the winner is determined. As a first cut they are using the following data:

% VOTES – the percent of votes polled obtained by the winning candidate

MARGIN – the margin of victory measured in number of votes
Gender – 1 is for Men and 0 for women
College – 1 is for college educated winners and 0 for those who did not go to college.
They run the regression for all 543 elected MPs. The model output is provided below (with some information missing):
Table 3.1
Regression Statistics

Multiple R          0.728809
R Square            0.531163
Adjusted R Square   0.528553
Standard Error      5.6332
Observations        543

Table 3.2
ANOVA

             Df    SS         MS          F          Significance F
Regression   3     19377.83   6459.2767   203.5514
Residual     539   17104.06   31.7329
Total        542   36481.89

Table 3.3 Coefficients

            Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept   38.59235       0.937225         41.1772             36.75129    40.4334106
MARGIN      5.32E-05       2.18E-06         24.4036             4.89E-05    5.7463E-05
Gender      1.551306       0.777806         1.99446             0.023404    3.07920835
College     -1.47506       0.586995         -2.5129             -2.62814    -0.3219783

(i) Fill up the Tables 3.1, 3.2 and 3.3 above (except the p values and the Significance
F values). Clearly write all the steps. [10 points]

SOLUTION
STEP I:
We first fill in Table 3.3. The standard deviation of the response variable Y (percentage of votes polled by the winning candidate), which is needed later to standardize the coefficients, is

$$S_Y = \sqrt{\frac{SS_Y}{N-1}} = \sqrt{\frac{36481.89}{543-1}} = 8.20425$$

Each t-statistic is the coefficient divided by its standard error, $t_{\beta_i} = \hat{\beta}_i / SE(\hat{\beta}_i)$:

$$t_{\beta_0} = \frac{38.59235}{0.937225} = 41.177252$$

$$t_{\beta_1} = \frac{5.32\text{E-}05}{2.18\text{E-}06} = 24.40367$$

$$t_{\beta_2} = \frac{1.551306}{0.777806} = 1.99446$$

$$t_{\beta_3} = \frac{-1.47506}{0.586995} = -2.5129$$
STEP II:
We next fill Table 3.2. We have

$$SSE = 17104.06, \qquad SST = 36481.89$$

so that $SSR = SST - SSE = 19377.83$. The regression degrees of freedom are 4 - 1 = 3, and the residual degrees of freedom are the total degrees of freedom (542) minus 3, that is 539. Hence, the mean squares are the respective sums of squares divided by their degrees of freedom:

$$MS_{Regression} = MSR = \frac{SSR}{df_R} = \frac{19377.83}{3} = 6459.2767$$

and

$$MS_{Residual} = MSE = \frac{SSE}{df_{residual}} = \frac{17104.06}{539} = 31.732949$$

Thus, the F-value for the model is

$$F_{cal,3,539} = \frac{MSR}{MSE} = \frac{6459.2767}{31.732949} = 203.551$$

STEP III:
We now fill Table 3.1. We have:

$$R^2 = \frac{SSR}{SST} = \frac{19377.83}{36481.89} = 0.531163$$

$$\text{Multiple } R = \sqrt{R^2} = \sqrt{0.531163} = 0.728809$$

$$\text{Adjusted } R^2 = 1 - \frac{(1 - R^2)(N - 1)}{N - k - 1} = 1 - \frac{(1 - 0.531163)(542)}{543 - 3 - 1} = 0.528553$$

$$\text{Standard Error (of the model)} = \sqrt{MSE} = \sqrt{31.732949} = 5.6332$$

This completes this part.
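
The three steps can be reproduced in one pass. The following is a minimal sketch, assuming only the given SST, SSE, sample size, number of predictors and the coefficient/standard-error pairs from Table 3.3; the function and key names are illustrative.

```python
import math


def fill_regression_tables(sst, sse, n_obs, n_predictors):
    """Derive the missing ANOVA and summary entries from SST and SSE."""
    ssr = sst - sse
    df_reg = n_predictors
    df_res = n_obs - n_predictors - 1
    msr = ssr / df_reg
    mse = sse / df_res
    r2 = ssr / sst
    return {
        "SSR": ssr,
        "df_regression": df_reg,
        "df_residual": df_res,
        "MSR": msr,
        "MSE": mse,
        "F": msr / mse,
        "R2": r2,
        "multiple_R": math.sqrt(r2),
        "adjusted_R2": 1 - (1 - r2) * (n_obs - 1) / df_res,
        "standard_error": math.sqrt(mse),
    }


def t_stats(coefficients):
    """t-statistic for each (estimate, standard error) pair."""
    return {name: est / se for name, (est, se) in coefficients.items()}


tables = fill_regression_tables(sst=36481.89, sse=17104.06, n_obs=543, n_predictors=3)
t_values = t_stats({
    "Intercept": (38.59235, 0.937225),
    "MARGIN": (5.32e-05, 2.18e-06),
    "Gender": (1.551306, 0.777806),
    "College": (-1.47506, 0.586995),
})

for key, value in tables.items():
    print(f"{key}: {value:.6g}")
for name, t in t_values.items():
    print(f"t({name}) = {t:.4f}")
```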

(i) Assuming that t is significant for any value greater than 1.964 at 5%, are the
variables (margin, gender and college) significant? [2 points]
SOLUTION
From (i), we have

$$t_{\beta_1} = t_{margin} = 24.40367 > 1.964$$

so that the variable "margin" is significant at 5%.

Next, comparing absolute values with 1.964,

$$t_{\beta_2} = t_{gender} = 1.99446 > 1.964$$

$$|t_{\beta_3}| = |t_{college}| = 2.5129 > 1.964$$

so that "gender" and "college" are also significant at 5%.

(ii) Assuming that the critical value of F is 2.621 at 5% significance, is the overall
regression significant? [2 points]

SOLUTION
The overall regression is significant because

$$F_{cal,3,539} = \frac{MSR}{MSE} = \frac{6459.2767}{31.732949} = 203.551 > 2.621$$

at the 5% significance level.

The analytics firm decides to dig a little deeper and looks at two outlying states, UP and AP, one
of which has significantly lower assets per winner and the other significantly higher. Both the
new variables are 0-1 variables. The values for some of the regressions are given below (Table
3.4).

Table 3.4 Regression Models with Corresponding R-Square

Regression Model   Independent Variables             R²
1                  MARGIN
2                  MARGIN, Gender                    0.52567
3                  MARGIN, Gender, College           0.531163
4                  MARGIN, Gender, College, UP       0.56051
5                  MARGIN, Gender, College, UP, AP   0.581339

(iii) What is the part correlation for College and % of votes in Regression model 3? [2 points]
SOLUTION

$$(\text{Part correlation for College})^2 = R_3^2 - R_2^2 = 0.531163 - 0.52567 = 0.005493$$

and hence the magnitude of the part correlation is

$$\sqrt{0.005493} = 0.07411$$

Because the College coefficient in Table 3.3 is negative, the part correlation is -0.07411.

(iv) Between regression 2 and 5 is it justified to add the additional variables? [2 points]

SOLUTION
Partial F test:

$$F_{cal} = \frac{(R^2_{New} - R^2_{Old}) / (\text{No. of new regressors})}{(1 - R^2_{New}) / (N - k - 1)} = \frac{(0.581339 - 0.52567)/3}{(1 - 0.581339)/(543 - 5 - 1)} = 23.80148$$

Since 23.80148 > 2.621 (the critical value at the 5% significance level), it is justified to add the new variables.

Regression model 5 in Table 3.4 has a standard error of 5.333135 and an overall F value of 149.1324 with a significance of 4.4 × 10^-99. The standard deviation of the dependent variable is 8.204253. The regression coefficients and the standard deviations of the independent variables are given below (Table 3.5).

Table 3.5

            Coefficients   Standard deviation
Intercept   38.56993
MARGIN      5.58E-05       111365.7
Gender      1.498308       0.311494
College     -1.53774       0.412796
UP          -3.71439       0.354761
AP          5.715821       0.209766

(v) Which variable has the greatest impact on Voting % ?

SOLUTION
To compare the relative impact, we have to standardise these coefficients. The formula is

$$\hat{\beta}_{std} = \hat{\beta} \times \frac{S_X}{S_Y}$$

where S_X is the standard deviation of the independent variable and S_Y = 8.204253 is the standard deviation of the dependent variable, so that the values of the standardised coefficients are

            Coefficients   Standard deviation   Standardised Coefficients
Intercept   38.56993
MARGIN      5.58E-05       111365.7             0.7574
Gender      1.498308       0.311494             0.0569
College     -1.53774       0.412796             -0.0774
UP          -3.71439       0.354761             -0.1606
AP          5.715821       0.209766             0.1461

Hence, MARGIN has the greatest impact on voting percentage, since its standardised coefficient is the largest in absolute value.
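
The standardised coefficients can be recomputed from Table 3.5. This sketch assumes the conventional definition, standardised coefficient = B × S_X / S_Y (the inverse of the relationship used in Question 1.6), with S_Y = 8.204253 as stated for the dependent variable; the dictionary layout is illustrative. It prints each standardised coefficient and the variable with the largest absolute value.

```python
# (unstandardised coefficient, standard deviation of the variable), from Table 3.5
coefficients = {
    "MARGIN": (5.58e-05, 111365.7),
    "Gender": (1.498308, 0.311494),
    "College": (-1.53774, 0.412796),
    "UP": (-3.71439, 0.354761),
    "AP": (5.715821, 0.209766),
}
SD_Y = 8.204253  # standard deviation of the dependent variable, from the text

# Standardised coefficient: change in Y (in SDs) per one-SD change in X.
standardised = {name: b * sd_x / SD_Y for name, (b, sd_x) in coefficients.items()}

# Relative impact is judged by absolute value, since the sign only gives direction.
largest = max(standardised, key=lambda name: abs(standardised[name]))

for name, value in standardised.items():
    print(f"{name}: {value:.4f}")
print("largest absolute standardised coefficient:", largest)
```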

[2 points]
