PS5 Econ320 2024
Problem set 5
Due at 11.59pm on November 26th
1. In this problem we study what happens when “irrelevant variables” are included in the
regression and derive more formally what this does to the variance of the OLS estimator.
Assume as in the lecture notes that all variables are demeaned, i.e., that
$\bar{X} = \bar{Z} = \bar{Y} = 0$, where $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$. We assume the true model is
$$Y_i = \beta X_i + \varepsilon_i, \tag{1}$$
and we denote the corresponding OLS estimator of $\beta$ by $\hat{\beta}$. The model we actually estimate is
$$Y_i = \beta X_i + \gamma Z_i + u_i, \tag{2}$$
and we denote the corresponding OLS estimator of $\beta$ by $\tilde{\beta}$. Moreover, we assume that $X_i$ and
$Z_i$ are deterministic (non-random), so that our Gauß-Markov assumptions on
the regression model look like
$$E(\varepsilon_i) = 0 \quad \text{and} \quad \operatorname{Var}(\varepsilon_i) = \sigma^2,$$
as well as $E[\varepsilon_i \varepsilon_j] = 0$ for $i \neq j$. Throughout, all random variables are sampled iid.
We will solve this problem together step by step.
(a) Set up the optimization problem for the OLS estimators β̃, γ̃ in the model (2).
(b) Show that the first order optimality conditions are
$$\sum_{i=1}^{n} \left( Y_i X_i - \beta X_i^2 - \gamma Z_i X_i \right) = 0$$
$$\sum_{i=1}^{n} \left( Y_i Z_i - \beta X_i Z_i - \gamma Z_i^2 \right) = 0.$$
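$$\operatorname{Var}(\tilde{\beta}) = \frac{\operatorname{Var}\left( \left(\sum_i Z_i^2\right) \left(\sum_i X_i Y_i\right) - \left(\sum_i X_i Z_i\right) \left(\sum_i Z_i Y_i\right) \right)}{\left( \left(\sum_i X_i^2\right) \left(\sum_i Z_i^2\right) - \left(\sum_i X_i Z_i\right)^2 \right)^2}.$$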
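$$\operatorname{Var}(\tilde{\beta}) = \frac{\operatorname{Var}\left( \left(\sum_i Z_i^2\right) \left(\sum_i X_i \varepsilon_i\right) - \left(\sum_i X_i Z_i\right) \left(\sum_i Z_i \varepsilon_i\right) \right)}{\left( \left(\sum_i X_i^2\right) \left(\sum_i Z_i^2\right) - \left(\sum_i X_i Z_i\right)^2 \right)^2}.$$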
(j) This is the variance of β̃. Now assume that $\sum_i X_i Z_i = 0$. Show that
$$\operatorname{Var}(\tilde{\beta}) = \frac{\sigma^2}{\sum_i X_i^2}.$$
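As a numerical sanity check on the result in (j), and not a substitute for the derivation, the following Python sketch simulates the estimated model with fixed regressors that satisfy $\sum_i X_i Z_i = 0$ and compares the simulated variance of β̃ with $\sigma^2 / \sum_i X_i^2$. The regressor values, the true β = 2, σ = 1.5, the number of replications, and the helper name `beta_tilde` are illustrative assumptions that do not come from the problem set.

```python
import random
import statistics


def beta_tilde(x, z, y):
    """Coefficient on x when y is regressed on x and z without an intercept.

    Obtained by solving the two first order conditions for beta.
    """
    sxx = sum(a * a for a in x)
    szz = sum(b * b for b in z)
    sxz = sum(a * b for a, b in zip(x, z))
    sxy = sum(a * c for a, c in zip(x, y))
    szy = sum(b * c for b, c in zip(z, y))
    return (szz * sxy - sxz * szy) / (sxx * szz - sxz ** 2)


# Fixed (non-random) regressors: both have mean zero and sum(x * z) == 0.
x = [2.0, -2.0, 1.0, -1.0] * 10
z = [1.0, 1.0, -1.0, -1.0] * 10
beta, sigma = 2.0, 1.5  # hypothetical true values, chosen for illustration

rng = random.Random(0)
draws = []
for _ in range(5000):
    # True model (1): Z is irrelevant. Because x and z have mean zero,
    # demeaning y would not change the sums used in beta_tilde.
    y = [beta * xi + rng.gauss(0.0, sigma) for xi in x]
    draws.append(beta_tilde(x, z, y))

simulated_var = statistics.pvariance(draws)
theoretical_var = sigma ** 2 / sum(xi * xi for xi in x)
print(simulated_var, theoretical_var)
```

The two printed numbers should agree up to simulation noise. Replacing `z` with a vector that is correlated with `x` is one way to see how the variance changes when the orthogonality assumption fails.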
$$Y_i = \beta X_i + \varepsilon_i,$$
$$X_i = \gamma Z_i + u_i.$$
(a) Write out the Wald estimator in summation form (NOT in matrix form).
(b) Denote the fitted values of $X_i$ from the first stage as $\hat{X}_i = \hat{\gamma} Z_i$. Write out the
2SLS estimator in summation form.
(c) Show that $\hat{\beta}^{\text{Wald}} = \hat{\beta}^{\text{2SLS}}$. [Hint: What can you say about $\sum_i (X_i - \hat{X}_i) Z_i$? Use
the normal equations.]
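The equality in (c) can be checked numerically before proving it. The Python sketch below assumes that the Wald estimator is the ratio of the reduced-form slope (Y on Z) to the first-stage slope (X on Z), and that all regressions are run without an intercept, as the two equations above are written; the lecture notes may define it differently. The simulated data, the coefficient values, and the helper name `slope` are hypothetical.

```python
import random


def slope(z, v):
    """No-intercept OLS slope from regressing v on z."""
    return sum(a * b for a, b in zip(z, v)) / sum(a * a for a in z)


rng = random.Random(1)
n = 500
z = [rng.gauss(0.0, 1.0) for _ in range(n)]
u = [rng.gauss(0.0, 1.0) for _ in range(n)]
# eps is correlated with u, so x is endogenous in the outcome equation.
eps = [0.8 * ui + rng.gauss(0.0, 1.0) for ui in u]
x = [0.7 * zi + ui for zi, ui in zip(z, u)]
y = [2.0 * xi + ei for xi, ei in zip(x, eps)]

# Wald (assumed definition): reduced-form slope over first-stage slope.
gamma_hat = slope(z, x)
beta_wald = slope(z, y) / gamma_hat

# 2SLS: regress y on the first-stage fitted values.
x_hat = [gamma_hat * zi for zi in z]
beta_2sls = slope(x_hat, y)

# The quantity from the hint: first-stage residuals times the instrument.
resid_dot_z = sum((xi - xh) * zi for xi, xh, zi in zip(x, x_hat, z))
print(beta_wald, beta_2sls, resid_dot_z)
```

Under these assumptions the two estimates coincide up to floating-point error, and the sum from the hint is numerically zero.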
(i) Why is the variable 1(small) endogenous now? Explain in no more than 5 sentences.
(ii) Argue in no more than 5 sentences that 1(initial small) is a valid instrument.
4. Suppose you want to estimate the effect of a change in tax laws in the state of
fictitious Georgia, where in 2022, the income tax brackets were lowered. You want to
estimate the effect of this policy change on the average household income in fictitious
Georgia. Assume you have data on average household income for all 50 US states
from January 2010 to November 2024. Tell me in no more than 8 sentences
how you would conceptually estimate this causal effect. In particular, tell me what
assumptions you would make in order to make sure you actually get at the true causal
effect.