Finalexam Solutions v2 2016

The document provides solutions to a final exam for an econometrics course, detailing various regression analyses related to salary determinants for university graduates and employed individuals. It includes evaluations of omitted variable bias, simultaneous causality, and sample selection, along with explanations of advanced econometric methods like weighted least squares and instrumental variables. Additionally, it discusses CEO salary predictions based on firm and CEO characteristics, including gender effects and interactions with age and education.

Uploaded by

bilalmchtgl
EC331 – Econometrics NAME: _____________________

Fall 2016 ID #: _______________________


FINAL EXAM - Solutions
January 2, 2017 (100+10 points, 125 minutes)
Notes:
- Calculation errors did not cost you any points.
- You lost 1 or 2 points in log regressions for writing %.0x instead of %x.
- If you forgot the ‘hat’ in the bonus question, you lost 3 points.
- In 1-c if you did not clearly express that we keep staff size the same, you lost 2
points.
- For 1-e and 2-d grading was commensurate with the effort you have put in figuring
out the ways the issues may arise.
- In 3, you are graded for the two best answers. Hence, if you responded correctly to
WLS and IV you have received 15 points.

You went home for the New Year and told everyone how your life has changed during the Fall
semester, after you have taken EC 331. A high school student that you know is unconvinced. S/he
says that things learned at school cannot have any practical use at all. You decided to help this
kid with your new skills. You already know that what matters in a person’s life is how much that
person earns and decide to explore what determines one’s salary.
1. (30) Your first task is to figure out which university would be the best choice. You have
data for 180 universities in Turkey and run a set of regressions (see the table below; standard
errors in parentheses). The dependent variable is the log of the average monthly salary of the
graduates. The independent variables are a dummy for being located in a metropolitan
area, log of enrollment, log of staff size, the rate of those who dropped out of the university
(Droprate) and the rate of those who go on to graduate study (Gradrate).
Variable       (1)                (2)                (3)
Metropol       0.825 (0.200)      0.625 (0.165)      0.589 (0.165)
Log(staff)     -                  0.0874 (0.0073)    0.0881 (0.0073)
Log(enroll)    -                  -0.222 (0.050)     -0.218 (0.050)
Droprate       -                  -                  -0.00028 (0.00161)
Gradrate       -                  -                  0.00097 (0.00066)
Intercept      10.523 (0.042)     10.884 (0.252)     10.738 (0.258)
R_square       0.040              0.353              0.361
a. (7) Is the second regression better than the first one? Base your judgement on three
different criteria: one using the ‘Metropol’ variable’s coefficient, one using one of the
added variables, and one based on general fit of the data. Be clear, precise and brief in
your arguments.
The second specification appears to be better because:

- The coefficient on Metropol changes (from 0.825 to 0.625), which signals that the first
specification may be subject to omitted variable bias.

- The added variables are individually significant (t-statistics larger than 2 in absolute value).

- R_square (as well as adjusted R_square) is much larger for the second specification.
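
The second and third criteria can be checked with a few lines of arithmetic. The sketch below uses only the figures in the table; the adjusted R_square formula is the standard one and is assumed here, since the exam does not print it, and the function names are illustrative.

```python
# Figures copied from the regression table (specifications 1 and 2).
n = 180  # number of universities

def t_stat(coef, se):
    """t-statistic for H0: coefficient = 0."""
    return coef / se

def adjusted_r2(r2, n, k):
    """Adjusted R-squared with k regressors (intercept not counted)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

t_staff = t_stat(0.0874, 0.0073)
t_enroll = t_stat(-0.222, 0.050)
adj_r2_spec1 = adjusted_r2(0.040, n, k=1)
adj_r2_spec2 = adjusted_r2(0.353, n, k=3)

print(f"t(log staff)  = {t_staff:.2f}")
print(f"t(log enroll) = {t_enroll:.2f}")
print(f"adjusted R2: spec 1 = {adj_r2_spec1:.3f}, spec 2 = {adj_r2_spec2:.3f}")
```

Both t-statistics exceed 2 in absolute value, and the adjusted R_square still rises sharply after the penalty for the two extra regressors.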

b. (5) Express in words what the third specification implies about the relation between
average salary and staff size.

A 1% increase in staff size is expected to increase average salary by 0.0881% on average,
holding the other regressors constant.

c. (5) Why is the coefficient on enrollment negative? Give a reasoning for that.

Note that we control for staff size. If enrollment increases while staff size stays
the same, the quality of education decreases and this results in lower salaries for
graduates.

d. (5) Assume that the error term is homoscedastic. Describe how you would test that droprate
and gradrate are jointly significant.

We can use the R_square in specification 2 (restricted regression) and in
specification 3 (unrestricted regression) to calculate the F statistic (see MT2 solutions for
the formula). Then we can compare it with the critical value.
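
As a worked illustration, the sketch below applies the standard homoskedasticity-only F statistic based on R_square. It is assumed, not confirmed, that this is the formula given in the MT2 solutions. The critical value 3.00 is the 5% value for F(2, infinity) from a standard table; the exact F(2, 174) value is slightly larger.

```python
def f_stat_from_r2(r2_unrestricted, r2_restricted, q, n, k_unrestricted):
    """Homoskedasticity-only F statistic for q joint restrictions."""
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1 - r2_unrestricted) / (n - k_unrestricted - 1)
    return numerator / denominator

# Specification 3 (unrestricted, 5 regressors) vs specification 2 (restricted).
f_stat = f_stat_from_r2(0.361, 0.353, q=2, n=180, k_unrestricted=5)

F_CRIT_5PCT = 3.00  # approximate 5% critical value, F(2, infinity)
reject = f_stat > F_CRIT_5PCT

print(f"F = {f_stat:.2f}, reject H0 at 5%: {reject}")
```

With these numbers F is about 1.09, so the joint null that both coefficients are zero is not rejected at the 5% level.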

e. (8) Evaluate your study based on the following criteria (three sentences at most for each):

- Omitted variable bias
- Simultaneous causality
- Sample selection

Omitted variable bias: there may be some omitted variables such as the average ability of
students, which would affect the salary and be correlated with the graduation rate.

Simultaneous causality: higher average salary may attract more students and
result in higher enrollment. It may as well push students to work harder and not to
drop out.

Sample selection: assuming that all universities in the country are included, we have no
concerns in that respect.
2. (25) Now you need to pick which industry you should recommend for employment. You
have four choices: finance, consumer products, utilities, and transportation. You find a
sample of employed individuals for whom salary and the characteristics of the job they are
employed in are available.

a. (5) You first run a regression of log salary on four dummies (one for each industry)
but the software does not give any results. Why did this happen?

Including one dummy for each category together with the intercept results in perfect
multicollinearity (the dummy variable trap): the four dummies sum to 1 for every observation.
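
The linear dependence can be seen directly in the design matrix. The sketch below uses a small hypothetical sample (the industry labels and observations are made up for illustration) and checks that the intercept column equals the sum of the four dummies in every row, and that dropping one dummy breaks this particular dependence.

```python
industries = ["finance", "consprod", "utility", "transport"]

# Hypothetical sample: the industry of six employed individuals.
sample = ["finance", "utility", "transport", "consprod", "finance", "utility"]

def design_row(industry, categories):
    """Intercept followed by one 0/1 dummy per listed category."""
    return [1] + [1 if industry == c else 0 for c in categories]

# All four dummies plus the intercept: intercept == sum of dummies in every row.
full_rows = [design_row(ind, industries) for ind in sample]
trap = all(row[0] == sum(row[1:]) for row in full_rows)

# Drop the transportation dummy: the identity no longer holds for every row.
reduced_rows = [design_row(ind, industries[:-1]) for ind in sample]
dependence_broken = any(row[0] != sum(row[1:]) for row in reduced_rows)

print("intercept = sum of dummies in every row:", trap)
print("dependence broken after dropping one dummy:", dependence_broken)
```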

The student proposes to drop one variable and try again. ‘Maybe, s/he says, your
computer cannot process so many variables and that is the reason why we do not get
any result’. You try again, dropping the dummy variable for transportation, and this
time it works well. Here are the results (standard errors in parentheses):
log(salary) = 4.59 + .257 log(sales) + .011 roe + .158 finance - .181 consprod - .283 utility
              (0.30) (.032)            (.004)     (.089)         (.085)          (.099)
n = 209, R^2 = .357
b. (5) Well you say, according to the results, estimated salary in utilities is ….. compared
to estimated salary in finance. What should there be instead of “….”? (no need to
perform a hypothesis test, just write down the correct quantity in correct terms).

-.283 - .158 = - .441

“approximately 44.1 % lower”
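
The 44.1% figure is the usual log-point approximation. For a difference this large the exact percentage implied by the log difference is noticeably smaller in magnitude; the sketch below computes both. The exact conversion 100 x (exp(d) - 1) is a standard result added here for illustration, not part of the graded answer.

```python
import math

# Difference in predicted log(salary): utilities minus finance.
log_diff = -0.283 - 0.158

approx_pct = 100 * log_diff                  # log-point approximation
exact_pct = 100 * (math.exp(log_diff) - 1)   # exact percentage change

print(f"log difference   = {log_diff:.3f}")
print(f"approximate      = {approx_pct:.1f}%")
print(f"exact conversion = {exact_pct:.1f}%")
```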

c. (7) The student is rather curious about finance vs transportation. What do the results
tell about the difference between salaries in finance vs transportation? Is the difference
statistically significant at the 5% level?

Transportation is the omitted (base) category, so the difference is the coefficient on
finance: salaries in finance are (approximately) 15.8% higher.

t = .158/0.089 = 1.78 < 1.96, not significant at the 5% level.
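
The test can be reproduced in a couple of lines. The critical values 1.96 and 1.645 are the usual two-sided normal values for the 5% and 10% levels; the 10% comparison is an extra not asked for in the question.

```python
coef_finance, se_finance = 0.158, 0.089

t_finance = coef_finance / se_finance

significant_5pct = abs(t_finance) > 1.96    # two-sided, 5% level
significant_10pct = abs(t_finance) > 1.645  # two-sided, 10% level

print(f"t = {t_finance:.3f}")
print("significant at 5%: ", significant_5pct)
print("significant at 10%:", significant_10pct)
```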

d. (8) Evaluate your study based on the following criteria (three sentences at most for
each):

- Omitted variable bias
- Simultaneous causality
- Sample selection

Omitted variable bias: there may be some omitted variables such as the ability of the
individuals, which would affect the salary and be correlated with the sales or roe.

Simultaneous causality: if you stretch the reasoning, you may argue that higher average
salary may lead to higher sales. I find it difficult to build a convincing economic
mechanism for that.

Sample selection: note that the sample is formed of individuals, not firms. Hence sample
selection bias would occur if individuals are not sampled randomly but in a particular way.
I would not suspect it a priori without further information.

3. (15) To impress the high-school student, you start talking about some complicated
methods even though you know very little about some of them. Describe each of the methods
listed below in three sentences, also indicating what kind of issue it addresses.

a. Weighted least squares

This is a method where each observation’s dependent and independent variables
are weighted by an observation-specific number. It is most often used in the presence of
heteroskedasticity of a known form. It is also used when the sampling under- or
over-represents some groups and the extent of the misrepresentation is known.
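
A minimal sketch of the idea for one regressor, assuming the error variance is proportional to x squared, so that the weights are 1/x^2. The data are hypothetical and the function name is illustrative.

```python
def wls_line(x, y, w):
    """Weighted least squares fit of y = b0 + b1*x with weights w_i."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical data whose noise grows with x.
x = [1, 2, 3, 4, 5, 6]
y = [5.1, 7.8, 11.6, 13.1, 18.2, 18.9]

# Assumed variance form: Var(u | x) proportional to x^2, so weight by 1 / x^2.
w = [1 / xi ** 2 for xi in x]

b0, b1 = wls_line(x, y, w)
b0_ols, b1_ols = wls_line(x, y, [1.0] * len(x))  # equal weights reproduce OLS

print(f"WLS: intercept = {b0:.3f}, slope = {b1:.3f}")
print(f"OLS: intercept = {b0_ols:.3f}, slope = {b1_ols:.3f}")
```

Observations with small x (low assumed variance) get more weight, so the WLS line follows them more closely than the OLS line does.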

b. Instrumental variables estimator

When the independent variable is correlated with the error term we would end
up with a biased (and inconsistent) estimator. The instrumental variables method addresses
the problem by using a third variable (the instrument), which is correlated with the
independent variable but not with the error.
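
A small simulation can illustrate the point. The data-generating process below is entirely hypothetical: an unobserved confounder v enters both x and the error, z is a valid instrument, and the true slope is 2. With one regressor and one instrument, the IV estimator reduces to cov(z, y) / cov(z, x).

```python
import random

random.seed(0)
n = 20000

z = [random.gauss(0, 1) for _ in range(n)]  # instrument
v = [random.gauss(0, 1) for _ in range(n)]  # unobserved confounder
e = [random.gauss(0, 1) for _ in range(n)]  # pure noise

x = [zi + vi for zi, vi in zip(z, v)]  # regressor, correlated with v
y = [1 + 2 * xi + 3 * vi + ei for xi, vi, ei in zip(x, v, e)]  # true slope is 2

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (len(a) - 1)

b_ols = cov(x, y) / cov(x, x)  # absorbs the confounder, so it is biased upward
b_iv = cov(z, y) / cov(z, x)   # simple IV estimator with one instrument

print(f"OLS slope = {b_ols:.2f}")
print(f"IV slope  = {b_iv:.2f}")
```

In this setup OLS overstates the slope, while the IV estimate lands close to the true value of 2.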

c. Probit estimation

When the dependent variable is a binary variable, the regression may be thought of as
modelling the probability that the dependent variable takes the value 1. Then,
OLS may be inappropriate since the predicted value may fall below 0 or above 1. Probit
addresses the problem by modelling the probability with the standard normal cumulative
distribution function and estimating the coefficients by maximum likelihood (MLE).
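
The sketch below illustrates only the functional-form point, not the MLE step. The coefficients of both models are hypothetical, chosen so that the linear probability model leaves the unit interval while the probit prediction cannot.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Hypothetical coefficients, for illustration only.
b0, b1 = -0.2, 0.15  # linear probability model: P(y=1|x) = b0 + b1*x
g0, g1 = -2.0, 0.5   # probit: P(y=1|x) = Phi(g0 + g1*x)

xs = [0, 2, 4, 6, 8, 10]
lpm = [b0 + b1 * x for x in xs]
probit = [normal_cdf(g0 + g1 * x) for x in xs]

for x, p_lin, p_pro in zip(xs, lpm, probit):
    print(f"x = {x:2d}  linear = {p_lin:5.2f}  probit = {p_pro:5.3f}")
```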

4. (30) The student is an ambitious one and plans to be a CEO in the future. Luckily you also
have data available on CEO salaries and certain company characteristics (annual firm
sales, return on equity in percent form, and return on the firm’s stock in percent form) as
well as CEO characteristics (gender, MBA dummy, age). Regression results are given
below (standard errors are given in parentheses).

Variable        Coefficient    Std. Error
Sales           -0.28          0.35
Roe             0.0174         0.0041
Ros             0.00024        0.00054
Female          -0.713         0.056
MBA             0.402          0.089
Female x MBA    0.231          0.035
Age             0.005          0.0004
Age x MBA       -0.0007        0.0002
Constant        4.32           0.32
n = 209, R^2 = .283
a. (5) By what percent is salary predicted to increase, if ros increases by 50
points? Test H0: 50 x coefficient of ros = 0 (do not test H0: coefficient of ros = 0).

A 1 point increase in ros increases the salary by 0.024% (100 x 0.00024). Then 50 points
would change the salary by 1.2%.

Note that s.e.(50 x coeff. of ros) = 50 x s.e.(coeff. of ros) = 50 x 0.00054 = 0.027.
t = 0.012/0.027 = 0.44 < 1.96. Fail to reject H0.
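
Because the coefficient and its standard error are scaled by the same constant, the t-statistic is unchanged by the scaling. A short check using the table values:

```python
coef_ros, se_ros = 0.00024, 0.00054
scale = 50

effect = scale * coef_ros   # change in log(salary) for a 50 point rise in ros
se_effect = scale * se_ros  # standard error scales by the same factor

t_scaled = effect / se_effect
t_unscaled = coef_ros / se_ros

print(f"effect = {effect:.3f} (about {100 * effect:.1f}%)")
print(f"s.e.   = {se_effect:.3f}")
print(f"t (scaled) = {t_scaled:.3f}, t (unscaled) = {t_unscaled:.3f}")
```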

b. (8) Are women with an MBA paid less than men with an MBA (no need
to test)? By how much (in log(salaries))?

Yes. Women with an MBA would differ from men with an MBA by -0.713 +
0.231 = -0.482 in log(salaries), i.e. they are predicted to earn less.
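
The sign of the gap is easy to get wrong, so the sketch below rebuilds it from the table coefficients. The helper function is illustrative; it sums only the CEO-characteristic terms, since the firm terms cancel when the firms are identical.

```python
# Coefficients copied from the CEO regression table.
COEF = {
    "female": -0.713,
    "mba": 0.402,
    "female_x_mba": 0.231,
    "age": 0.005,
    "age_x_mba": -0.0007,
}

def ceo_terms(female, mba, age):
    """Sum of the CEO-characteristic terms in predicted log(salary)."""
    return (COEF["female"] * female
            + COEF["mba"] * mba
            + COEF["female_x_mba"] * female * mba
            + COEF["age"] * age
            + COEF["age_x_mba"] * age * mba)

# Female minus male, both with an MBA, at two different ages.
gap_at_45 = ceo_terms(1, 1, 45) - ceo_terms(0, 1, 45)
gap_at_65 = ceo_terms(1, 1, 65) - ceo_terms(0, 1, 65)

print(f"gap at 45 = {gap_at_45:.3f}, gap at 65 = {gap_at_65:.3f}")
```

The gap is the same at both ages because the model contains no interaction between Female and Age.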

c. (5) What is the implicit assumption about the gender effect in the way the age and
MBA interaction is incorporated in the model?

The difference between male and female employees in the effect of MBA
does not change with age.

d. (7) What is the estimated difference in log(salaries) between a 45 year old
male CEO with an MBA and one who is 65 years old and also has an MBA degree
(assuming all other company characteristics are the same)?

The 45 year old and the 65 year old differ by 20 x (0.005 – 0.0007) = 20 x 0.0043 =
0.086 in log(salaries), with the 65 year old predicted to earn more.
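
The arithmetic can be verified directly: for a CEO with an MBA the effect of one more year of age is the Age coefficient plus the Age x MBA coefficient.

```python
AGE_COEF = 0.005
AGE_X_MBA_COEF = -0.0007

# Effect of one extra year of age on log(salary) for a CEO with an MBA.
age_slope_with_mba = AGE_COEF + AGE_X_MBA_COEF

# 65 year old minus 45 year old, both male with an MBA.
log_salary_diff = (65 - 45) * age_slope_with_mba

print(f"slope per year = {age_slope_with_mba:.4f}")
print(f"difference over 20 years = {log_salary_diff:.3f}")
```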

e. (5) Consider three alternative specifications for the above. The first one is the
one presented in the table, the second one has log(sales) instead of age and the
third has the sales and the square of it. State two things that you would need to
pay attention to in order to decide which one you should use.

You need to compare the fit (R_square, or adjusted R_square when the number of
regressors differs) and the significance of the relevant coefficients.


BONUS: (10) Consider the following model:
𝑦𝑖 = 𝛽0 + 𝑢𝑖
Using the least squares method derive the estimator for 𝛽0.
See solutions to MT1.
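
The MT1 solutions are not reproduced here. As a sketch of the standard argument (the notation b_0 for a candidate value and S for the sum of squared residuals is introduced for illustration and may differ from MT1), minimise the sum of squared residuals over b_0 and solve the first-order condition:

```latex
\begin{align*}
S(b_0) &= \sum_{i=1}^{n} (y_i - b_0)^2 \\
\frac{dS}{db_0} &= -2 \sum_{i=1}^{n} (y_i - \hat{\beta}_0) = 0 \\
\sum_{i=1}^{n} y_i &= n \hat{\beta}_0 \\
\hat{\beta}_0 &= \frac{1}{n} \sum_{i=1}^{n} y_i = \bar{y}
\end{align*}
```

The second derivative is 2n > 0, so this is a minimum. The estimator carries a hat: it is the sample mean of y, not the population parameter.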
