
Solutions to lab 2 Statistical Inference

Nataliia Ostapenko

24 November 2017
Question 1 from Homework

I 1. E(y) = E(z − x) = E(z) − E(x) = 30 − 5 = 25
I 2. Var(x) = E(x²) − (E(x))² = 80 − 5² = 55
I 3. Cov(z, y) = E(zy) − E(z)E(y) = 1500 − 30 · 25 = 750
I 4. Var(x) = Var(z − y) = Var(z) + Var(y) − 2 · Cov(z, y)
I I 55 = Var(z) + 50 − 2 · 750
I I Var(z) = 1505
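The moment calculations above can be replayed numerically; the sketch below (Python rather than the course's Stata) just re-applies the identities to the given moments E(z) = 30, E(x) = 5, E(x²) = 80, Var(y) = 50 and E(zy) = 1500.

```python
# Replaying the moment identities with the homework's given values.
E_z, E_x, E_x2, Var_y, E_zy = 30, 5, 80, 50, 1500

E_y = E_z - E_x                     # y = z - x, so E(y) = 30 - 5 = 25
Var_x = E_x2 - E_x**2               # Var(x) = 80 - 25 = 55
Cov_zy = E_zy - E_z * E_y           # 1500 - 30*25 = 750
# Var(x) = Var(z) + Var(y) - 2*Cov(z, y), solved for Var(z)
Var_z = Var_x - Var_y + 2 * Cov_zy  # 55 - 50 + 1500 = 1505

print(E_y, Var_x, Cov_zy, Var_z)    # 25 55 750 1505
```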
Question 1 from Homework. Linear projection

I β = Cov(x, y)/Var(y) since we are projecting x on y!
I Cov(x, y) = E(xy) − E(x)E(y), but we don't have E(xy), so let's
rewrite it
I = E((z − y)y) − E(z − y)E(y) by the definition of x
I = E(zy) − E(y²) − E(z)E(y) + (E(y))², where the first and third
terms give Cov(z, y) and the second and fourth give −Var(y)
I Cov(x, y) = Cov(z, y) − Var(y)
Question 1 from Homework. Linear projection, continued

I Therefore β = (Cov(z, y) − Var(y))/Var(y)
I β = Cov(z, y)/Var(y) − 1 = 750/50 − 1 = 14
I β0 = E(x) − β · E(y) = 5 − 14 · 25 = −345
I The linear projection is L(x|1, y) = β0 + β · y = −345 + 14 · y
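The same kind of numerical check works for the projection coefficients; this sketch reuses the moments derived above (Cov(z, y) = 750, Var(y) = 50, E(x) = 5, E(y) = 25).

```python
# Slope and intercept of the linear projection L(x | 1, y) = beta0 + beta*y.
Cov_zy, Var_y, E_x, E_y = 750, 50, 5, 25

Cov_xy = Cov_zy - Var_y        # Cov(x, y) = Cov(z, y) - Var(y) = 700
beta = Cov_xy / Var_y          # 700/50 = 14
beta0 = E_x - beta * E_y       # 5 - 14*25 = -345

print(beta0, beta)             # -345.0 14.0
```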
Question 3 from Homework (a). T-test

I 1. H0: β0 = 0, H1: β0 ≠ 0
I t-stat = (β̂0 − β0)/s.e.(β̂0)
I t-stat = (−12.95 − 0)/14.23 = −0.91
I t critical (95%) = 1.987 from the statistical table
I |t| is not higher than the critical value, so we cannot reject H0
Question 3 from Homework (a). T-test

I 2. H0: β1 = 1, H1: β1 ≠ 1
I t-stat = (β̂1 − β1)/s.e.(β̂1)
I t-stat = (0.886 − 1)/0.085 = −1.34
I t critical (95%) = 1.987 from the statistical table
I |t| is not higher than the critical value, so we cannot reject H0
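Both t-tests follow the same recipe, so they can be scripted; the critical value 1.987 is the table value quoted on the slides (it corresponds to roughly 92 degrees of freedom).

```python
# t-statistics for H0: beta0 = 0 and H0: beta1 = 1, estimates from the slides.
def t_stat(beta_hat, beta_null, se):
    return (beta_hat - beta_null) / se

t0 = t_stat(-12.95, 0, 14.23)   # intercept test
t1 = t_stat(0.886, 1, 0.085)    # slope test
t_crit = 1.987                  # 5% two-sided critical value from the table

print(round(t0, 2), round(t1, 2))            # -0.91 -1.34
print(abs(t0) < t_crit, abs(t1) < t_crit)    # True True -> cannot reject either H0
```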
Question 3 from Homework (b). Direction of the Bias

I When we have omitted variable bias, the direction of the bias is
I Bias = (effect of the omitted variable on the dependent
variable) × (correlation between the omitted variable and the
included independent variable)
I Bias = β2 · corr(assess, sqrft)
I Bias = (+) · (+) = +, so we have a positive bias here
Question 3 from Homework (c). Testing Joint Hypothesis

I H0: β0 = 0 and β1 = 1; H1: β0 ≠ 0 or β1 ≠ 1
I F = [(SSRr − SSRur)/number of restrictions] / [SSRur/(n − k − 1)]
I F = [(208349.11 − 144323.88)/2] / [144323.88/92] = 32012.615/1568.7378 = 20.41
I F(2, 92) 95% critical = 3.10 from the statistical table
I F > the critical value, so we reject H0
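The SSR form of the F statistic is easy to verify with the numbers from the slides (q = 2 restrictions, df = n − k − 1 = 92):

```python
# F test of H0: beta0 = 0 and beta1 = 1, SSR form.
SSR_r, SSR_ur = 208349.11, 144323.88
q, df_ur = 2, 92

F = ((SSR_r - SSR_ur) / q) / (SSR_ur / df_ur)
print(round(F, 2))   # 20.41, well above the 5% critical value 3.10
```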
Question 3 from Homework (d). Testing Joint Hypothesis

I NB: now the unrestricted model is the model presented in part (d)
I H0: β2 = 0 and β3 = 0 and β4 = 0
I H1: at least one of them is non-zero
I F = [(R²ur − R²r)/number of restrictions] / [(1 − R²ur)/(n − k − 1)]
I F = [(0.895 − 0.871)/3] / [(1 − 0.895)/89] ≈ (0.025/3)/(0.104/89) = 7.13
(the displayed R² values are rounded)
I F(3, 89) 95% critical = 2.71 from the statistical table
I F > the critical value, so we reject H0
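The R² form can be checked the same way; the slides carry the intermediate values 0.025 and 0.104, which match the rounded R² figures only up to rounding, so the check below uses those intermediates directly.

```python
# F test of H0: beta2 = beta3 = beta4 = 0, R-squared form,
# using the slides' intermediate values (q = 3 restrictions, df = 89).
num = 0.025 / 3     # (R2_ur - R2_r)/q
den = 0.104 / 89    # (1 - R2_ur)/(n - k - 1)

F = num / den
print(round(F, 2))  # 7.13, above the 5% critical value 2.71
```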
Question 3 from Homework (e). Heteroscedastic errors

I The F-test is not F-distributed, just as the t-test is not t-distributed!


I Hypothesis testing is invalid
I But coefficients are still unbiased
Question 3 from Homework (f). Multicollinearity

I If there is imperfect (but strong) multicollinearity, it will increase
the standard errors in the regression
I t-statistics for individual coefficients might be insignificant while
the F-test is significant
I If there is perfect multicollinearity, we cannot identify the
coefficients
Question 4 from Homework

I Residual MS = RSS/df = 4381.53/28094 = 0.15596
I Root MSE = √(Residual MS) = √0.15596 = 0.3949
I R² = Model SS/Total SS = 2033.3/6414.82 = 0.3170
I F(4, 28094) = (R²/k) / ((1 − R²)/(n − k − 1)) =
(0.3170/4)/(0.6830/28094) = 3259.32
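These ANOVA quantities can all be reconstructed from the sums of squares alone; a quick check (note that √0.15596 ≈ 0.3949):

```python
import math

# Reconstructing the ANOVA table entries of Question 4 from the slides'
# sums of squares.
RSS, ModelSS, TotalSS = 4381.53, 2033.3, 6414.82
df, k = 28094, 4

residual_ms = RSS / df              # ~0.15596
root_mse = math.sqrt(residual_ms)   # ~0.3949
R2 = ModelSS / TotalSS              # ~0.3170
F = (R2 / k) / ((1 - R2) / df)      # ~3259

print(round(residual_ms, 5), round(root_mse, 4), round(R2, 4), round(F))
```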
Question 4 from Homework

Figure 1: Answers
Exercise 3.14

. use hprice1.dta, clear

. regress price sqrft bdrms

Source | SS df MS Number of obs = 88


-------------+---------------------------------- F(2, 85) = 72.96
Model | 580009.152 2 290004.576 Prob > F = 0.0000
Residual | 337845.354 85 3974.65122 R-squared = 0.6319
-------------+---------------------------------- Adj R-squared = 0.6233
Total | 917854.506 87 10550.0518 Root MSE = 63.045

------------------------------------------------------------------------------
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sqrft | .1284362 .0138245 9.29 0.000 .1009495 .1559229
bdrms | 15.19819 9.483517 1.60 0.113 -3.657582 34.05396
_cons | -19.315 31.04662 -0.62 0.536 -81.04399 42.414
------------------------------------------------------------------------------
Exercise 3.14 (a)

I NB: the house price is measured in $1000s
I price = −19.3 + 0.13 · sqrft + 15.2 · bdrms
. describe price

storage display value


variable name type format label variable label
------------------------------------------------------------------------------------------
price float %9.0g house price, $1000s

Exercise 3.14 (b), (c), (d)

I 2. $15,200 is the estimated increase in price for a house with one
more bedroom.
I 3. The estimated increase in price for a house with one more
bedroom that is also 140 sq ft larger is
(15.2 + 0.13 · 140) · $1000 = $33,400
I I this estimated increase in price is 2.2 times higher than in (b)
I 4. 63.2%
. display -19.3+0.13*2438+15.2*4
358.44
Exercise 3.14 (e), (f)

I price = (−19.3 + 0.13 · 2438 + 15.2 · 4) · $1000 = $358,440
I 5. predicted price = $358,440
I 6. the residual is 358,440 − 300,000 = $58,440
I I he underpaid by $58,440
I But there are many other features of a house (omitted
variables?) that affect price, and we have not controlled for
these.
. display 358440-300000
58440
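The arithmetic in parts (c), (e) and (f) can be bundled into one short script, using the rounded coefficients from the slides (prices in $1000s):

```python
# Predicted-price arithmetic for Exercise 3.14, rounded coefficients.
b0, b_sqrft, b_bdrms = -19.3, 0.13, 15.2

# (c): one more bedroom that also adds 140 sq ft
delta = (b_bdrms + b_sqrft * 140) * 1000                  # $33,400

# (e): predicted price for sqrft = 2438, bdrms = 4
price_hat = (b0 + b_sqrft * 2438 + b_bdrms * 4) * 1000    # $358,440

# (f): the buyer paid $300,000
residual = price_hat - 300000                             # $58,440

print(round(delta), round(price_hat), round(residual))
```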
Exercise 4.14 (a)

I lprice = 4.77 + 0.0003 ∗ sqrft + 0.03 ∗ bdrms


I θ = 150 ∗ 0.000379 + 0.0289 = 0.0858
I which means that an additional 150 square foot bedroom
increases the predicted price by about 8.6%.
. regress lprice sqrft bdrms

Source | SS df MS Number of obs = 88


-------------+---------------------------------- F(2, 85) = 60.73
Model | 4.71671468 2 2.35835734 Prob > F = 0.0000
Residual | 3.30088884 85 .038833986 R-squared = 0.5883
-------------+---------------------------------- Adj R-squared = 0.5786
Total | 8.01760352 87 .092156362 Root MSE = .19706

------------------------------------------------------------------------------
lprice | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sqrft | .0003794 .0000432 8.78 0.000 .0002935 .0004654
bdrms | .0288844 .0296433 0.97 0.333 -.0300543 .0878232
_cons | 4.766027 .0970445 49.11 0.000 4.573077 4.958978
------------------------------------------------------------------------------

. display 150*0.000379 + 0.0289


.08575
Exercise 4.14 (b)
I β2 = θ1 − 150 ∗ β1
I lprice = β0 + β1 ∗ sqrft + (θ1 − 150 ∗ β1 ) ∗ bdrms
I lprice = β0 + β1 ∗ sqrft + θ1 ∗ bdrms − 150 ∗ β1 ∗ bdrms
I lprice = β0 + β1 ∗ (sqrft − 150 ∗ bdrms) + θ1 ∗ bdrms
I we need to generate sqrft-150*bdrms and insert it into the
regression
. generate new=sqrft-150*bdrms

. regress lprice new bdrms

Source | SS df MS Number of obs = 88


-------------+---------------------------------- F(2, 85) = 60.73
Model | 4.71671468 2 2.35835734 Prob > F = 0.0000
Residual | 3.30088884 85 .038833986 R-squared = 0.5883
-------------+---------------------------------- Adj R-squared = 0.5786
Total | 8.01760352 87 .092156362 Root MSE = .19706

------------------------------------------------------------------------------
lprice | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
new | .0003794 .0000432 8.78 0.000 .0002935 .0004654
bdrms | .0858013 .0267675 3.21 0.002 .0325804 .1390223
_cons | 4.766027 .0970445 49.11 0.000 4.573077 4.958978
------------------------------------------------------------------------------
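The reparameterization is pure algebra, so it can be illustrated on synthetic data (hprice1.dta is not used here, and the data-generating coefficients below are made up for the simulation): the coefficient on bdrms in the transformed regression equals θ1 = β2 + 150·β1 from the original fit.

```python
import numpy as np

# Sketch of the reparameterization on simulated data (hypothetical
# coefficients; this is not the hprice1 dataset).
rng = np.random.default_rng(0)
n = 200
sqrft = rng.uniform(1000, 3000, n)
bdrms = rng.integers(1, 6, n).astype(float)
lprice = 4.77 + 0.0004 * sqrft + 0.03 * bdrms + rng.normal(0, 0.2, n)

# Original regression: lprice on (1, sqrft, bdrms)
X1 = np.column_stack([np.ones(n), sqrft, bdrms])
b = np.linalg.lstsq(X1, lprice, rcond=None)[0]
theta1 = b[2] + 150 * b[1]          # theta1 = beta2 + 150*beta1

# Transformed regression: lprice on (1, sqrft - 150*bdrms, bdrms)
new = sqrft - 150 * bdrms
X2 = np.column_stack([np.ones(n), new, bdrms])
g = np.linalg.lstsq(X2, lprice, rcond=None)[0]

print(np.isclose(g[2], theta1))     # True: the bdrms coefficient is theta1
```

Because the two design matrices span the same column space, this identity holds exactly (up to floating point), which is why Stata reports θ1's standard error directly on bdrms in the transformed fit.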
Exercise 4.14 (c)

I θ = 0.0858
I NB: θ is not the coefficient of the new variable, but the
coefficient of bdrms; see the equation!
I s.e. = 0.027
I CI = 0.0858 ± 0.027 · 1.987
I CI = [0.032151; 0.139449]
. display 0.0858+0.027*1.987
.139449

. display 0.0858-0.027*1.987
.032151
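The confidence interval arithmetic, with the estimate, standard error and table value from the slides:

```python
# 95% CI for theta in Exercise 4.14 (c).
theta_hat, se, t_crit = 0.0858, 0.027, 1.987

lo = theta_hat - t_crit * se
hi = theta_hat + t_crit * se
print(round(lo, 6), round(hi, 6))   # 0.032151 0.139449
```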
Exercise 4.17

. use WAGE2.dta, clear

. regress lwage educ exper tenure

Source | SS df MS Number of obs = 935


-------------+---------------------------------- F(3, 931) = 56.97
Model | 25.6953242 3 8.56510806 Prob > F = 0.0000
Residual | 139.960959 931 .150334005 R-squared = 0.1551
-------------+---------------------------------- Adj R-squared = 0.1524
Total | 165.656283 934 .177362188 Root MSE = .38773

------------------------------------------------------------------------------
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .0748638 .0065124 11.50 0.000 .062083 .0876446
exper | .0153285 .0033696 4.55 0.000 .0087156 .0219413
tenure | .0133748 .0025872 5.17 0.000 .0082974 .0184522
_cons | 5.496696 .1105282 49.73 0.000 5.279782 5.713609
------------------------------------------------------------------------------
Exercise 4.17
I We need to test H0: β2 = β3
I Let's rewrite it as H0: θ = β2 − β3 = 0
I We test it against H1: θ ≠ 0
I We can express β2 = θ + β3
I Then rewrite the equation as
I lwage = β0 + β1 ∗ educ + (θ + β3) ∗ exper + β3 ∗ tenure
I lwage = β0 + β1 ∗ educ + θ ∗ exper + β3 ∗ (tenure + exper)
I We need to generate newvariable = tenure + exper and insert it
into the regression
. generate new=tenure+exper

. regress lwage educ exper new

Source | SS df MS Number of obs = 935


-------------+---------------------------------- F(3, 931) = 56.97
Model | 25.6953242 3 8.56510806 Prob > F = 0.0000
Residual | 139.960959 931 .150334005 R-squared = 0.1551
-------------+---------------------------------- Adj R-squared = 0.1524
Total | 165.656283 934 .177362188 Root MSE = .38773

------------------------------------------------------------------------------
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .0748638 .0065124 11.50 0.000 .062083 .0876446
exper | .0019537 .0047434 0.41 0.681 -.0073554 .0112627
new | .0133748 .0025872 5.17 0.000 .0082974 .0184522
Exercise 4.17

I θ = 0.0019537
I NB: θ is not the coefficient of the new variable, but the
coefficient of exper; see the equation!
I s.e. = 0.0047434
I t = (0.0019537 − 0)/0.0047434 = 0.41
I t critical (95%) = 1.96 from the statistical table
I |t| is not greater than the critical value -> we cannot reject H0 ->
θ is not statistically significantly different from 0
I The CI also includes 0!
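The final test statistic and interval, again with the slides' numbers (1.96 is the large-sample table value):

```python
# t-test for H0: theta = beta2 - beta3 = 0 in Exercise 4.17.
theta_hat, se, t_crit = 0.0019537, 0.0047434, 1.96

t = (theta_hat - 0) / se
lo = theta_hat - t_crit * se
hi = theta_hat + t_crit * se

print(round(t, 2))       # 0.41: cannot reject H0
print(lo < 0 < hi)       # True: the CI includes 0
```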
