PS4 Solution

The document contains solutions to problems from an econometrics course. For problem 1, the solution derives the expected value of the dependent variable given the independent variables for a binary choice model. It also notes that the error term is heteroskedastic. For problem 2, the solution outlines a latent theoretical model and corresponding estimable model for binary choice. It notes the marginal effects are not constant. For later problems, it comments on statistical significance of coefficients and overall models based on provided output.

Uploaded by

Guido Gilli
Copyright
© All Rights Reserved

Solutions to Problem Set 4

Econometrics (30413)

Spring 2021

Theory Questions

Question 1

Consider a model where the outcome variable y is binary:

$$
y_i =
\begin{cases}
1 & \text{if the individual has graduated} \\
0 & \text{if the individual has not graduated}
\end{cases}
$$

and we have a linear model explaining y as a function of explanatory variables x:

$$
y_i = X_i \beta + \varepsilon_i
$$

a) Assume $E(\varepsilon_i \mid X_i) = 0$, and derive $E(y_i \mid X_i)$.

b) Is $\varepsilon_i$ normally distributed?

c) Is $\varepsilon_i$ homoskedastic?

d) Does $X_i \hat{\beta}_{OLS}$ assume values only between 0 and 1?

Solution

a) In this model,

$$
E(y_i \mid X_i) = E(X_i \beta + \varepsilon_i \mid X_i) = X_i \beta + E(\varepsilon_i \mid X_i) = X_i \beta
$$

Moreover, since y is Bernoulli-distributed,

$$
E(y_i \mid X_i) = 1 \cdot P(y_i = 1 \mid X_i) + 0 \cdot P(y_i = 0 \mid X_i) = P(y_i = 1 \mid X_i)
$$

Therefore,

$$
P(y_i = 1 \mid X_i) = E(y_i \mid X_i) = X_i \beta
$$

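b) No. Since $y_i$ is Bernoulli-distributed, $\varepsilon_i$ takes only two values conditional on $X_i$ (it is a shifted Bernoulli variable), so it cannot be normally distributed:

$$
\varepsilon_i =
\begin{cases}
1 - X_i \beta & \text{if } y_i = 1 \text{ (with probability } X_i \beta \text{)} \\
-X_i \beta & \text{if } y_i = 0 \text{ (with probability } 1 - X_i \beta \text{)}
\end{cases}
$$

As a consequence, $\hat{\beta}_{OLS}$ is not normally distributed in finite samples, but it is still asymptotically normal.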

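c) $\varepsilon_i$ is heteroskedastic, indeed

$$
\begin{aligned}
\operatorname{var}(\varepsilon_i \mid X_i) &= E\left[ (\varepsilon_i - E(\varepsilon_i \mid X_i))^2 \mid X_i \right] = E\left[ \varepsilon_i^2 \mid X_i \right] \\
&= (1 - X_i \beta)^2 \cdot P(y_i = 1 \mid X_i) + (-X_i \beta)^2 \cdot P(y_i = 0 \mid X_i) \\
&= (1 - X_i \beta)^2 X_i \beta + (X_i \beta)^2 (1 - X_i \beta) \\
&= X_i \beta (1 - X_i \beta) \left[ (1 - X_i \beta) + X_i \beta \right] = X_i \beta (1 - X_i \beta)
\end{aligned}
$$

As a consequence,

$$
V(\hat{\beta}_{OLS} \mid X) \neq \sigma^2 (X'X)^{-1}
$$

and OLS is no longer BLUE. Therefore, standard errors must be adjusted for heteroskedasticity (with heteroskedasticity-robust standard errors) in order to test the significance of the parameters.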

d) No. Given that it is a linear fit, $X_i \hat{\beta}_{OLS}$ is unbounded and can hence take values below 0 or above 1.
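Points c) and d) can be illustrated with a small simulation. The sketch below is not part of the original solution: it assumes a hypothetical data-generating process (a single regressor drawn uniformly on [-4, 4] and a logistic success probability), fits a linear probability model by OLS, and checks that some fitted values fall outside [0, 1]. The helper names `ols_fit` and `error_variance` are illustrative only.

```python
import math
import random


def error_variance(p):
    """Conditional variance of the LPM error, p * (1 - p), where p = X_i * beta."""
    return p * (1.0 - p)


def ols_fit(x, y):
    """OLS of y on a constant and a single regressor; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


# Hypothetical DGP (an assumption, not taken from the problem set).
rng = random.Random(0)
x = [rng.uniform(-4.0, 4.0) for _ in range(2000)]
y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-2.0 * xi)) else 0 for xi in x]

b0, b1 = ols_fit(x, y)
fitted = [b0 + b1 * xi for xi in x]

# The linear fit is unbounded: the extreme fitted "probabilities" leave [0, 1].
print("min fitted:", min(fitted), "max fitted:", max(fitted))

# The error variance changes with the success probability (heteroskedasticity).
print("variance at p = 0.5:", error_variance(0.5))
print("variance at p = 0.9:", error_variance(0.9))
```

Under this assumed design the true probability is not linear in x, so the example only illustrates the unboundedness of the linear fit and the dependence of the error variance on $X_i \beta$; it says nothing about the data used in the applied questions.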

Question 2

Suppose you have a sample of households and you are interested in determining the variables
which are relevant to the choice of buying a boat.

a) Which type of latent theoretical model can be considered? Which type of estimable model
should we consider?

b) Write down a linear latent theoretical model.

c) Write down the estimable model.

d) Are the marginal effects constant?

Solution

a) We can imagine that there is a latent unobserved variable (willingness to pay for a boat),
such that when the latent unobserved variable falls above a certain threshold (a so-called
reserve price), the boat is purchased, and otherwise not.

By contrast, in the estimable model the dependent variable is binary, capturing observed
buying choices regarding boats, i.e. whether a boat is purchased or not by an individual.

b) Letting $y_i^*$ be the latent unobserved variable (willingness to pay for a boat), we can relate this linearly to a set of observables $X_i$ as follows

$$
y_i^* = X_i \beta + \varepsilon_i
$$

where $\varepsilon_i$ is assumed to be iid, conditional on X, according to some distribution that is symmetric around zero, with CDF $F(\cdot)$.

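c) The estimable model is specified as follows

$$
y_i =
\begin{cases}
1 & \text{if } y_i^* = X_i \beta + \varepsilon_i \geq 0 \\
0 & \text{if } y_i^* = X_i \beta + \varepsilon_i < 0
\end{cases}
\iff
y_i =
\begin{cases}
1 & \text{if } \varepsilon_i \geq -X_i \beta \\
0 & \text{if } \varepsilon_i < -X_i \beta
\end{cases}
$$

Therefore, we can notice that

$$
E(y_i \mid X_i) = P(y_i = 1 \mid X_i) = P(\varepsilon_i \geq -X_i \beta) = 1 - P(\varepsilon_i \leq -X_i \beta) = 1 - F(-X_i \beta) = F(X_i \beta)
$$

where the last step follows from the assumption that the distribution of $\varepsilon_i$ conditional on X is symmetric.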

d) Marginal effects are not constant, indeed

$$
\frac{\partial P(y_i = 1 \mid X_i)}{\partial x_{ij}} = \frac{\partial}{\partial x_{ij}} F(X_i \beta) = \beta_j \cdot f(X_i \beta)
$$

i.e. marginal effects depend on $X_i$ through the pdf of $\varepsilon$, $f$.
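A quick numerical check of points c) and d) is sketched below. It takes the logistic distribution as one example of a symmetric $F$ and uses hypothetical coefficients and data points (none of them come from the problem set). It verifies the symmetry step $1 - F(-z) = F(z)$ and compares the analytic marginal effect $\beta_j f(X_i \beta)$ with a finite-difference derivative of $F(X_i \beta)$.

```python
import math


def logistic_cdf(z):
    """CDF of the standard logistic distribution (symmetric around zero)."""
    return 1.0 / (1.0 + math.exp(-z))


def logistic_pdf(z):
    """pdf of the standard logistic distribution, F(z) * (1 - F(z))."""
    p = logistic_cdf(z)
    return p * (1.0 - p)


def marginal_effect(beta_j, index):
    """Analytic marginal effect beta_j * f(X_i * beta)."""
    return beta_j * logistic_pdf(index)


def numerical_effect(beta, x, j, h=1e-6):
    """Central finite difference of F(X_i * beta) with respect to x[j]."""
    up = list(x)
    down = list(x)
    up[j] += h
    down[j] -= h
    f_up = logistic_cdf(sum(b * v for b, v in zip(beta, up)))
    f_down = logistic_cdf(sum(b * v for b, v in zip(beta, down)))
    return (f_up - f_down) / (2.0 * h)


# Hypothetical coefficients: a constant and one regressor.
beta = [-1.0, 0.5]

for x in ([1.0, 0.0], [1.0, 2.0], [1.0, 6.0]):
    index = sum(b * v for b, v in zip(beta, x))
    print(x, marginal_effect(beta[1], index), numerical_effect(beta, x, 1))
```

The same coefficient produces a different marginal effect at each of the three points, which is exactly what "not constant" means here.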

Applied Questions

Question 3

Consider a model where the dependent variable is a woman’s working choice

$$
y_i =
\begin{cases}
1 & \text{if the woman works} \\
0 & \text{if the woman does not work}
\end{cases}
$$

The estimated model is the following

$$
y_i = \beta_0 + \beta_1 \, age_i + \beta_2 \, educ_i + \beta_3 \, children_i + \varepsilon_i
$$

The summary statistics from the data, and the output from a logit model are reported below.

a) Comment on the estimated parameters. Are they statistically significant?

b) What test would you use to evaluate whether the whole estimated model is statistically
significant?

Solution

a) The significance of each coefficient is evaluated by means of t-tests. Given the reported p-values associated with the t-tests, all coefficients (except the intercept) are statistically significant at the 1% level.

b) The joint significance of the estimated coefficients (excluding the intercept) is evaluated by means of a likelihood ratio (LR) test.

The LR test statistic is given by

$$
LR = 2 \left[ \ln(L_1) - \ln(L_0) \right] \overset{H_0}{\approx} \chi^2_K
$$

where $\ln(L_1)$ is the maximised log-likelihood of the full model, and $\ln(L_0)$ is the maximised log-likelihood of a model where only the constant is included. Under the null, the LR statistic is asymptotically distributed as a $\chi^2_r$ with r degrees of freedom, where r is the number of restrictions that are imposed by the null. In our case, r = K, where K is the number of regressors (excluding the constant), as the null imposes that all coefficients except the constant are equal to zero.

In our case, we reject the null of no overall significance of the model, i.e. we conclude in favor of the alternative that the estimated model is overall significant.
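The mechanics of the LR test can be sketched in a few lines. The log-likelihood values below are hypothetical placeholders (the regression output is not reproduced here), and `chi2_sf` is an illustrative helper that evaluates the chi-square upper tail for integer degrees of freedom using the standard closed-form series, so that only the Python standard library is needed.

```python
import math


def lr_statistic(loglik_full, loglik_restricted):
    """LR = 2 * [ln(L1) - ln(L0)]."""
    return 2.0 * (loglik_full - loglik_restricted)


def chi2_sf(x, df):
    """P(chi2_df > x) for integer df >= 1, via the closed-form series."""
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
        series = sum(half ** j / math.factorial(j) for j in range(df // 2))
        return math.exp(-half) * series
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_j (x/2)^(j - 1/2) / Gamma(j + 1/2)
    series = sum(
        half ** (j - 0.5) / math.gamma(j + 0.5)
        for j in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(half)) + math.exp(-half) * series


# Hypothetical maximised log-likelihoods (assumptions for illustration only).
loglik_full = -490.0        # model with age, educ and children
loglik_restricted = -510.0  # model with the constant only
K = 3                       # number of restrictions under the null

lr = lr_statistic(loglik_full, loglik_restricted)
p_value = chi2_sf(lr, K)
print("LR =", lr, "p-value =", p_value)
print("reject H0 at 1%:", p_value < 0.01)
```

With real output, the two log-likelihoods would be read off the estimation results, and K = 3 matches the three regressors of the model specified in this question.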

Question 4

Consider the previous model, now estimated by a probit.

a) What is the difference between a probit and a logit model?

b) Derive the marginal effect of education on the dependent variable.

c) Is the model overall significant?

Solution

a) A probit model assumes that the errors in the (latent) theoretical model are distributed according to a standard normal distribution, whereas a logit model assumes that they are distributed according to a logistic distribution.

b) The marginal effect of education is given by

$$
\frac{\partial P(y_i = 1 \mid X_i)}{\partial educ_i} = \frac{\partial}{\partial educ_i} \Phi(X_i \beta) = \beta_{educ} \cdot \varphi(X_i \beta)
$$

where $\Phi$ is the standard normal CDF and $\varphi$ is the standard normal pdf.

Evaluated at the mean ($X_i = \bar{X}$), this marginal effect is approximately 0.04, i.e. an additional year of education is associated with an increase in the probability that a married woman is working of around 4 percentage points.

c) The model is overall significant according to the reported LR test of overall significance: the
LR statistic is 41.767 and the associated p-value is less than 1%.
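To make points a) and b) concrete, the sketch below codes the standard normal CDF and pdf next to the logistic pdf and evaluates the probit marginal effect $\beta_{educ} \cdot \varphi(X_i \beta)$. The coefficient and index values are hypothetical, since the probit output is not reproduced here; the functions themselves use only `math.erf` and `math.exp` from the standard library.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(z):
    """Standard normal pdf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def logistic_pdf(z):
    """Standard logistic pdf, shown for comparison with the probit case."""
    p = 1.0 / (1.0 + math.exp(-z))
    return p * (1.0 - p)


def probit_marginal_effect(beta_j, index):
    """Probit marginal effect beta_j * phi(X_i * beta)."""
    return beta_j * normal_pdf(index)


beta_educ = 0.10  # hypothetical probit coefficient, not the estimated one

for index in (-2.0, 0.0, 0.5, 4.0):  # hypothetical values of X_i * beta
    print(
        index,
        "P(y=1):", normal_cdf(index),
        "marginal effect:", probit_marginal_effect(beta_educ, index),
        "normal pdf:", normal_pdf(index),
        "logistic pdf:", logistic_pdf(index),
    )
```

The two densities differ in scale and in the tails (the logistic has heavier tails), which is why probit and logit coefficients are not directly comparable even when the implied marginal effects are similar.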

Question 5

Now consider the marginal effects from the logit model estimated in Question 3.

a) Derive the marginal effect of education on the dependent variable in the logit case.

b) Can we give any direct interpretation to the estimated coefficients βj ? What is the value of
the estimated coefficient on the education variable?

Solution

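a) In the logit case,

$$
\frac{\partial P(y_i = 1 \mid X_i)}{\partial educ_i} = \frac{\partial}{\partial educ_i} \left[ \frac{\exp(X_i \beta)}{1 + \exp(X_i \beta)} \right] = \beta_{educ} \, \frac{\exp(X_i \beta)}{\left[ 1 + \exp(X_i \beta) \right]^2}
$$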

b) The estimated coefficient on the education variable is β̂educ = 0.16067. It can directly be
interpreted as the marginal effect in the latent model.

Additionally, since in the estimated model the marginal effect is

$$
\frac{\partial P(y_i = 1 \mid X_i)}{\partial educ_i} = \frac{\partial}{\partial educ_i} F(X_i \beta) = \beta_{educ} \cdot f(X_i \beta)
$$

and the density f (Xi β) is always positive, the sign of the estimated coefficient and that of
the corresponding marginal effect are the same. That is, we can already say, by looking only
at the estimated coefficient, that the estimated marginal effect of education will be positive.
To determine its exact magnitude we need instead to pick a sample point (e.g. the mean)
and compute it.
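As a rough illustration of this last point, the sketch below plugs the reported coefficient $\hat{\beta}_{educ} = 0.16067$ into the logit marginal-effect formula from part a). The index values $X_i \beta$ are hypothetical, because the remaining coefficients and the sample means are not reproduced here; only the sign and the upper bound of the effect follow from the coefficient alone.

```python
import math

BETA_EDUC = 0.16067  # estimated logit coefficient on education, as reported above


def logit_marginal_effect(beta_j, index):
    """beta_j * exp(X_i * beta) / [1 + exp(X_i * beta)]^2."""
    e = math.exp(index)
    return beta_j * e / (1.0 + e) ** 2


# Hypothetical values of the index X_i * beta (assumptions for illustration).
for index in (-2.0, -0.5, 0.0, 1.0):
    effect = logit_marginal_effect(BETA_EDUC, index)
    print("index =", index, "marginal effect of educ =", effect)

# The logistic density peaks at 1/4 when the index is zero, so the marginal
# effect can never exceed beta_educ / 4, whatever the sample point.
print("upper bound:", BETA_EDUC / 4.0)
```

Every evaluated effect is positive, in line with the sign argument above, while the magnitude changes with the chosen sample point.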
