Finalexam Solutions v2 2016
You went home for the New Year and told everyone how your life has changed during the Fall
semester, after taking EC 331. A high school student you know is unconvinced. S/he
says that things learned at school cannot have any practical use at all. You decide to help this
kid with your new skills. You already know that what matters in a person’s life is how much that
person earns, so you decide to explore what determines one’s salary.
1. (30) Your first task is to figure out which university would be the best choice. You have
data for 180 universities in Turkey and run a set of regressions (see the table below; standard
errors in parentheses). The dependent variable is the log of the average monthly salary of the
graduates, and the independent variables are a dummy for being located in a metropolitan
area, log of enrollment, log of staff size, the share of students who dropped out of the university,
and the share of those who go on to graduate study.
Variable       (1)                (2)                (3)
Metropol       0.825 (0.200)      0.625 (0.165)      0.589 (0.165)
Log(staff)                        0.0874 (0.0073)    0.0881 (0.0073)
Log(enroll)                       -0.222 (0.050)     -0.218 (0.050)
Droprate                                             -0.00028 (0.00161)
Gradrate                                             0.00097 (0.00066)
Intercept      10.523 (0.042)     10.884 (0.252)     10.738 (0.258)
R_square       0.040              0.353              0.361
a. (7) Is the second regression better than the first one? Base your judgement on three
different criteria: one using the ‘Metropol’ variable’s coefficient, one using one of the
added variables, and one based on general fit of the data. Be clear, precise and brief in
your arguments.
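The second specification appears to be better because:
- The coefficient on Metropol changes (from 0.825 to 0.625) once the new variables are
added, which signals that the first specification may be subject to omitted variable bias.
- The added variables are individually significant; for example, the t-statistic on
Log(staff) is 0.0874/0.0073 ≈ 12.
- R_square (as well as adjusted R_square) is much larger for the second specification
(0.353 versus 0.040).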
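The comparison can be checked directly from the table. The sketch below recomputes the adjusted R_square of the first two specifications and the t-statistics of the added variables; the function names are illustrative, and n = 180 is taken from the question text.

```python
# Figures copied from the regression table (n = 180 universities).
n = 180


def adjusted_r2(r2, n_obs, k):
    """Adjusted R^2 for a model with k regressors plus an intercept."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - k - 1)


def t_stat(coef, se):
    """t-statistic for the null hypothesis that the coefficient is zero."""
    return coef / se


adj_r2_spec1 = adjusted_r2(0.040, n, k=1)  # Metropol only
adj_r2_spec2 = adjusted_r2(0.353, n, k=3)  # Metropol, Log(staff), Log(enroll)

t_staff = t_stat(0.0874, 0.0073)   # added variable Log(staff)
t_enroll = t_stat(-0.222, 0.050)   # added variable Log(enroll)

print(f"adjusted R^2: spec 1 = {adj_r2_spec1:.3f}, spec 2 = {adj_r2_spec2:.3f}")
print(f"t(Log(staff)) = {t_staff:.2f}, t(Log(enroll)) = {t_enroll:.2f}")
```

Both t-statistics are well beyond 1.96 in absolute value, and the adjusted R_square rises together with the plain R_square, which supports the criteria listed above.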
b. (5) Express in words what the third specification implies about the relation between
average salary and staff size.
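One way to read the Log(staff) coefficient in column 3, given as a sketch: both salary and staff size enter in logs, so the coefficient is an elasticity. The symbol for the estimated coefficient is shorthand introduced here, not notation from the exam.

```latex
\[
\frac{\partial \log(\text{salary})}{\partial \log(\text{staff})}
  = \hat{\beta}_{\text{staff}} = 0.0881
\quad\Longrightarrow\quad
\%\Delta\,\text{salary} \approx 0.0881 \times \%\Delta\,\text{staff}
\]
```

In words: holding the other regressors fixed, a 1% larger staff is associated with roughly 0.088% higher average graduate salary.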
c. (5) Why is the coefficient on enrollment negative? Give a reason.
Note that we control for staff size. If enrollment increases while staff size stays
the same, the quality of education decreases, and this results in lower salaries for
graduates.
d. (5) Assume that the error term is homoskedastic. Describe how you would test whether droprate
and gradrate are jointly significant.
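A minimal sketch of one way to carry out this test, assuming the R_square form of the F-statistic (which relies on homoskedasticity). Specification 2 serves as the restricted model and specification 3 as the unrestricted one. The variable names are illustrative, and the critical value is the large-sample approximation rather than an exact F(2, 174) quantile.

```python
# R^2 form of the F-statistic for H0: Droprate = 0 and Gradrate = 0.
n = 180                  # universities
k_unrestricted = 5       # Metropol, Log(staff), Log(enroll), Droprate, Gradrate
q = 2                    # number of restrictions under H0
r2_restricted = 0.353    # specification 2 (without Droprate, Gradrate)
r2_unrestricted = 0.361  # specification 3

df_denominator = n - k_unrestricted - 1
f_stat = ((r2_unrestricted - r2_restricted) / q) / (
    (1 - r2_unrestricted) / df_denominator
)

# Large-sample 5% critical value: chi-square(2) / 2 = 5.99 / 2, about 3.00.
critical_5pct = 3.00
reject_h0 = f_stat > critical_5pct

print(f"F = {f_stat:.3f}, reject H0 at 5%: {reject_h0}")
```

With these figures the statistic stays well below the critical value, so under the stated assumptions the two variables would not be judged jointly significant.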
e. (8) Evaluate your study based on the following criteria (three sentences at most for each):
There may be some omitted variables such as the average ability of students, which
would affect the salary and be correlated with the graduation rate.
As to simultaneous causality, higher average salary may attract more students and
result in higher enrolment. It may as well push students to work harder and not to
drop out.
Assuming that all universities in the country are included, we have no concerns in
that respect.
2. (25) Now you need to pick which industry you should recommend for employment. You
have four choices: finance, consumer products, utilities, and transportation. You find a
sample of employed individuals for whom salary and the characteristics of the job they are
employed in are available.
a. (5) You first run a regression of log salary on four dummies (one for each industry)
but the software does not give any results. Why did this happen?
Including one dummy for each category together with an intercept would result in perfect
multicollinearity (the dummy variable trap), since the four dummies always sum to one.
The student proposes to drop one variable and try again. ‘Maybe,’ s/he says, ‘your
computer cannot process so many variables, and that is why we do not get
any result.’ You try again, dropping the dummy variable for transportation, and this
time it works well. Here are the results (standard errors in parentheses):
log(salary) = 4.59   + .257 log(sales) + .011 roe + .158 finance - .181 consprod - .283 utility
              (0.30)   (.032)            (.004)     (.089)         (.085)          (.099)
n = 209, R^2 = .357
b. (5) Well you say, according to the results, estimated salary in utilities is ….. compared
to estimated salary in finance. What should there be instead of “….”? (no need to
perform a hypothesis test, just write down the correct quantity in correct terms).
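The quantity asked for can be sketched as a difference of two dummy coefficients, since both are measured relative to the omitted transportation category. The variable names are illustrative; the numbers come from the estimated equation above.

```python
import math

# Both coefficients are relative to the omitted category (transportation).
coef_finance = 0.158
coef_utility = -0.283

# Difference in predicted log(salary): utilities minus finance.
log_diff = coef_utility - coef_finance

# Usual approximation: 100 * difference in logs, read as a percentage.
approx_pct = 100 * log_diff

# Exact percentage difference implied by the log difference.
exact_pct = 100 * (math.exp(log_diff) - 1)

print(f"log difference = {log_diff:.3f}")
print(f"approx. {approx_pct:.1f}%, exact {exact_pct:.1f}%")
```

The log difference is -0.441, so estimated salary in utilities is about 44% lower than in finance by the usual approximation, holding sales and roe fixed; the exact percentage is somewhat smaller in absolute value because the approximation deteriorates for large log differences.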
c. (7) The student is rather curious about finance vs transportation. What do the results
tell us about the difference between salaries in finance and transportation? Is the difference
statistically significant at the 5% level?
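Because transportation is the omitted category, the finance coefficient itself measures the finance vs transportation gap, and its t-statistic answers the significance question. The sketch below assumes a two-sided test with the normal critical value 1.96, which is close to the t critical value at this sample size.

```python
# Finance dummy: coefficient and standard error from the estimated equation.
coef_finance = 0.158
se_finance = 0.089

# t-statistic for H0: no salary difference between finance and transportation.
t_finance = coef_finance / se_finance

# Two-sided 5% critical value (normal approximation; n = 209 is large).
critical_5pct = 1.96
significant_5pct = abs(t_finance) > critical_5pct

print(f"t = {t_finance:.3f}, significant at 5%: {significant_5pct}")
```

Estimated salary in finance is roughly 15.8% higher than in transportation, but the t-statistic falls short of 1.96, so under these assumptions the difference is not statistically significant at the 5% level.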
d. (8) Evaluate your study based on the following criteria (three sentences at most for
each):
If you stretch the reasoning, you may argue that a higher average salary may lead to
higher sales. It seems difficult, however, to build an economically plausible mechanism
for this.
Note that the sample consists of individuals, not firms. Hence sample selection bias
would occur if individuals are not sampled randomly but in a particular way. Without
further information, I would not suspect it a priori.
3. (15) To impress the high-school student, you start talking about some complicated
methods even though you know very little about some of them. Describe each of the methods
listed below in three sentences, also indicating what kind of issue it addresses.
c. Probit estimation
When the dependent variable is binary, the regression may be thought of as
modelling the probability that the dependent variable takes the value 1. In that case,
OLS may be inappropriate, since predicted values may fall outside the [0, 1] interval. Probit
addresses the problem by modelling the probability with the standard normal CDF,
which is bounded between 0 and 1, and estimating the coefficients by maximum
likelihood (MLE).
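The contrast can be illustrated without estimating anything. The sketch below feeds the same linear index through a linear probability reading and through the standard normal CDF used by probit. The coefficients b0 and b1 and the x values are hypothetical, chosen only for illustration.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, the link function used by probit."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical coefficients and regressor values (not estimates from any data).
b0, b1 = -0.2, 0.3
xs = [-2, 0, 2, 5]

# Linear probability reading: the index itself is treated as a probability.
linear_preds = [b0 + b1 * x for x in xs]

# Probit reading: the index is passed through the normal CDF.
probit_preds = [normal_cdf(b0 + b1 * x) for x in xs]

for x, lin, pro in zip(xs, linear_preds, probit_preds):
    print(f"x = {x:>2}: linear = {lin:+.2f}, probit = {pro:.3f}")
```

The linear predictions drop below 0 and rise above 1 at the extreme x values, while the probit predictions stay strictly between 0 and 1.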
4. (30) The student is an ambitious one and plans to be a CEO in the future. Luckily you also
have data available on CEO salaries and certain company characteristics (annual firm
sales, return on equity in percent form, and return on the firm’s stock in percent form) as
well as CEO characteristics (gender, MBA dummy, age). Regression results are given
below (standard errors are given in parentheses).
b. (8) Are women with an MBA paid less than men with an MBA (no need
to test)? By how much (in log(salaries))?
Women with an MBA would differ from men with an MBA by -0.713 +
0.231 = -0.482 in log(salaries), so they are paid less.
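A quick arithmetic check of this gap, as a sketch. The regression table is not reproduced here, so the roles of the two numbers are an assumption: -0.713 is taken to be the coefficient on the female dummy and 0.231 the coefficient on the female-MBA interaction.

```python
import math

# Assumed roles (the regression table is not shown in this document):
coef_female = -0.713      # assumed: female dummy
coef_female_mba = 0.231   # assumed: female x MBA interaction

# Gap in log(salary) between women with an MBA and men with an MBA.
log_gap = coef_female + coef_female_mba

# Exact percentage difference implied by the log gap.
pct_gap = 100 * (math.exp(log_gap) - 1)

print(f"log gap = {log_gap:.3f}, about {pct_gap:.1f}% in salary terms")
```

The sum is negative, so under these assumptions women with an MBA are predicted to earn less than men with an MBA.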
c. (5) What is the implicit assumption about the gender effect in the way the age and
MBA interaction is incorporated in the model?
The difference between male and female employees in the effect of MBA
does not change with age.
e. (5) Consider three alternative specifications for the above. The first is the
one presented in the table, the second has log(sales) instead of age, and the
third has sales and its square. State two things that you would need to
pay attention to in deciding which one to use.